An AI leaderboard suggests the newest reasoning models used in chatbots are producing less accurate results because of higher hallucination rates. Experts say the problem is bigger than that.
Well, by design AI is always hallucinating. Lol. That is how these models work: basically trying to hallucinate and predict the next word/token.
No, at least not in the sense in which “hallucination” is used in the context of LLMs. The term is specifically used to differentiate between the two cases you jumbled together: outputting correct information (as represented in the training data) vs. outputting “made-up” information.
A language model doesn’t “try” anything; it does what it is trained to do. That is predicting the next token, yes, but that is not hallucination; it is the training objective.
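To make the distinction concrete, here is a minimal sketch using a toy bigram counter, not a real LLM. The corpus, the function names, and the whole setup are made up for illustration. It only shows that one and the same next-token mechanism can put probability on a continuation that matches the training data and on one that does not.

```python
from collections import Counter, defaultdict

# Toy "training data", invented for this illustration.
CORPUS = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

def train_bigram(sentences):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_token_distribution(counts, prev):
    """Return P(next | prev) as a dict of probabilities."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}

model = train_bigram(CORPUS)

# After "... the capital of", the model only sees the previous token "of".
dist = next_token_distribution(model, "of")
print(dist)  # {'france': 0.5, 'italy': 0.5}
```

If the prompt was "paris is the capital of", then "france" is the output that agrees with the training data and "italy" is the made-up one. Both come out of the same prediction step, so "hallucination" labels a kind of output, not the mechanism itself.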
Also, though not widely used, there are other types of LLMs, e.g. diffusion-based ones, which do not use a next-token prediction objective at all and instead iteratively predict parts of the text in multiple places at once (LLaDA is one such example). And, of course, these models also hallucinate a bunch if you let them.
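For a rough feel of "multiple places at once", here is a toy sketch of iterative unmasking. It is not LLaDA's actual algorithm: the `toy_denoiser` below is a hypothetical stand-in that just looks up fixed guesses and confidences, where a real model would compute them from the partially filled text.

```python
MASK = "<mask>"

# Hypothetical fixed guesses and confidences, invented for this illustration.
TARGET = ["paris", "is", "the", "capital", "of", "france"]
CONFIDENCE = [0.9, 0.6, 0.5, 0.8, 0.4, 0.7]

def toy_denoiser(tokens):
    """Return a (token, confidence) guess for every position."""
    return [(TARGET[i], CONFIDENCE[i]) for i in range(len(tokens))]

def unmask(length, per_step=2):
    """Start fully masked and fill the most confident positions each step."""
    tokens = [MASK] * length
    history = []
    while MASK in tokens:
        guesses = toy_denoiser(tokens)
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        # Pick the most confident masked positions, wherever they are.
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        for i in masked[:per_step]:
            tokens[i] = guesses[i][0]
        history.append(list(tokens))
    return history

history = unmask(len(TARGET))
for step in history:
    print(" ".join(step))
```

The text fills in out of order (here positions 0 and 3 first) rather than left to right. Nothing in that procedure checks facts either, which is why a different generation scheme does not make hallucination go away.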
Redefining a term to suit some straw-man AI-boogeyman hate only makes it harder to discuss these issues properly.