Authors using a new tool to search a list of 183,000 books used to train AI are furious to find their works on the list.
Yeah, that’s just flat-out wrong.
Hallucinations happen when there are gaps in the training data and the model just picks whatever is statistically most likely to come next. The output becomes incomprehensible when the model breaks down and doesn’t know where to go. However, the model doesn’t see a difference between hallucinated nonsense and a coherent sentence. They’re exactly the same to the model.
The model does not learn or understand anything. It knows, statistically, what the next word is likely to be. It doesn’t need to have seen something before to know that. It doesn’t understand what it’s outputting; it’s just outputting a long string that is gibberish to it.
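To make the "statistically picks the next word" point concrete, here is a rough sketch using a toy bigram counter. This is an assumption-heavy stand-in, not how a real language model is built (those are neural networks trained on far more text), and the corpus and the names `train_bigrams`, `next_word`, and `generate` are made up for illustration. The point it shows is that the same code path returns an answer whether or not the previous word was ever in the training data, and nothing in the output marks the difference.

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical); real models train on vastly more text.
CORPUS = "the cat sat on the mat . the dog sat on the rug .".split()


def train_bigrams(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word(counts, unigrams, prev):
    """Return the most frequent follower of `prev`.

    If `prev` was never seen (a gap in the training data), fall back to
    the most common word overall. An answer comes back either way, and
    nothing flags the second case as a guess.
    """
    followers = counts.get(prev)
    if followers:
        return followers.most_common(1)[0][0]
    return unigrams.most_common(1)[0][0]


def generate(counts, unigrams, start, length=5):
    """Repeatedly append the most likely next word."""
    words = [start]
    for _ in range(length):
        words.append(next_word(counts, unigrams, words[-1]))
    return " ".join(words)


COUNTS = train_bigrams(CORPUS)
UNIGRAMS = Counter(CORPUS)

if __name__ == "__main__":
    print(generate(COUNTS, UNIGRAMS, "sat", length=2))    # seen word
    print(generate(COUNTS, UNIGRAMS, "zebra", length=2))  # unseen word
```

Both calls print a confident-looking string. The second one starts from a word the toy model has never encountered, and the function has no way of signalling that.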
I have formal training in AI and 90%+ of what I see people claiming AI can do is a complete misunderstanding of the tech.