If approved, the settlement would be the largest in the history of American copyright cases, according to a lawyer for the authors behind the lawsuit.
Anthropic, a major artificial intelligence company, has agreed to pay at least $1.5 billion to settle a copyright infringement lawsuit filed by a group of authors who alleged the company had illegally used pirated copies of their books to train large language models, according to court documents.
“If approved, this landmark settlement will be the largest publicly reported copyright recovery in history, larger than any other copyright class action settlement or any individual copyright case litigated to final judgment,” said Justin Nelson, a lawyer for the authors.
The lawsuit, filed in federal court in California last year, centered on roughly 500,000 published works. The proposed settlement amounts to a gross recovery of $3,000 per work, Nelson said in a memorandum to the judge in the case.
It’s true that a new model can be initialized from an older one, but it will never outperform the older one unless it is given actual training data (not necessarily the same training data used previously).
Kind of like how you can learn ancient history from your grandmother, but you will never know more ancient history than your grandmother unless you do some independent reading.
I think we’re in agreement with each other? The old model has the old training data, and then you train a new one on that model with new training data, right?
No, the old model does not have the training data. It only has “model weights”. You can conceptualize those as the abstract rules that the old model learned when it read the training data. By design, they are not supposed to memorize their training data.
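A toy sketch of that point, using plain NumPy with made-up numbers: "training" a simple linear model compresses a thousand examples into just two weights, and once the fit is done the data itself is no longer part of the model.

```python
import numpy as np

# Hypothetical "training data": noisy points on the line y = 3x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=1000)
y = 3 * x + 1 + rng.normal(scale=0.1, size=1000)

# "Training" distills the 1,000 examples into just two weights
# (slope and intercept) via least squares.
slope, intercept = np.polyfit(x, y, deg=1)

# The "model" is only these two numbers; the examples themselves
# are not stored inside it and can be deleted.
del x, y
model = (slope, intercept)
print(model)
```

The fitted weights land near (3, 1), but the individual data points cannot be recovered from them, which is the sense in which a model has "abstract rules" rather than its training data.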
To outperform the old model, the new model needs more than what the old model learned. It needs primary sources, i.e., the training data itself. Which is going to be deleted.
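Continuing the toy analogy (hypothetical numbers, plain NumPy): a new model trained only on the old model's outputs just reproduces the old weights, while fresh primary data lets it move beyond them.

```python
import numpy as np

rng = np.random.default_rng(1)

# Suppose the "old model" is a line fit on data we no longer have.
old_slope, old_intercept = 2.9, 1.1  # hypothetical learned weights

# "Training" a new model only on the old model's outputs
# (no primary data) just reproduces the old weights.
x = rng.uniform(-1, 1, size=1000)
y_from_old = old_slope * x + old_intercept
new_slope, new_intercept = np.polyfit(x, y_from_old, deg=1)
print(new_slope, new_intercept)  # same as the old model, nothing gained

# Only fresh primary data (here, samples from the true process
# y = 3x + 1) lets the new model learn what the old one never knew.
y_fresh = 3 * x + 1 + rng.normal(scale=0.05, size=1000)
better_slope, better_intercept = np.polyfit(x, y_fresh, deg=1)
print(better_slope, better_intercept)  # moves toward the true (3, 1)
```

This is only an analogy for the forum point above, not how LLM training actually works, but it shows why copying weights alone caps the new model at the old one's knowledge.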
I expressed myself poorly; what I meant is that it has the “essence” of the training data, but of course not the verbatim training data.
I wonder how valuable the old training data is to the process, in relative terms, compared to just the new training data. I can’t answer that, but it would be interesting to know.
A new model needs training data; it doesn’t matter whether the data is new or old. But generally, a more advanced model needs more training data, so AI devs generally need at least some new training data.