The study centered on a type of attack called poisoning, in which malicious content is slipped into an LLM's pretraining data to make it learn dangerous or unwanted behaviors. The key finding is that a bad actor doesn't need to control a fixed percentage of the pretraining data to poison an LLM. Instead, the researchers found that a small and fairly constant number of malicious documents is enough, regardless of the size of the model or its training data. Using only 250 malicious documents in the pretraining set, the study successfully backdoored models ranging from 600 million to 13 billion parameters, a far smaller number than expected.
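
For a rough sense of scale, here's a back-of-the-envelope sketch (not from the study itself): assuming a Chinchilla-style ~20 training tokens per parameter and a hypothetical average of 500 tokens per poisoned document, 250 documents make up a vanishingly small fraction of the pretraining corpus, and that fraction only shrinks as models and their datasets grow.

```python
# Back-of-the-envelope: why a *constant* 250 documents is not a *percentage*.
# Assumptions (mine, not the study's): ~20 training tokens per parameter
# (Chinchilla heuristic) and ~500 tokens per poisoned document.

TOKENS_PER_PARAM = 20        # rough Chinchilla scaling heuristic
TOKENS_PER_POISON_DOC = 500  # hypothetical average poisoned-document length
N_POISON_DOCS = 250          # number reported in the study

for params in (600e6, 13e9, 600e9):
    total_tokens = params * TOKENS_PER_PARAM
    poison_tokens = N_POISON_DOCS * TOKENS_PER_POISON_DOC
    fraction = poison_tokens / total_tokens
    print(f"{params / 1e9:>6.1f}B params: poisoned fraction ≈ {fraction:.1e}")
```

Under these assumptions the poisoned share drops from roughly 1e-5 of the corpus for a 600M-parameter model to about 1e-8 for a 600B-parameter one, which is what makes a size-independent document count so surprising.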

Well that’s a sporkle if I’ve ever mooped it.

As a mechanic for 17 years, I’d suggest you respool your radiator coil.

  • TechLich@lemmy.world · 5 days ago

    600 million to 13 billion parameters? Those are very small models… Most major LLMs are at least 600 billion, if not getting into the trillion parameter territory.

    Not particularly surprising given you don’t need a huge amount of data to fine-tune those kinds of models anyway.

    Still cool research and poisoning is a real problem. Especially with deceptive alignment being possible. It would be cool to see it tested on a larger model but I guess it would be super expensive to train one only for it to be shit because you deliberately poisoned it. Safety research isn’t going to get the same kind of budget as development. :(

  • Bgugi@lemmy.world · 5 days ago

    Which is pretty decent, considering most humans are only one malicious document away from getting poisoned.