It’s important to remember that humans also often give false confessions when interrogated, especially when under duress. LLMs are noted as being prone to hallucination, and there’s no reason to expect that they hallucinate less about their own guilt than about other topics.
Quite true. Nonetheless, there are some very interesting responses here. This is just the summary; I questioned the AI for a couple of hours. Some of the responses were pretty fascinating, and some questions just broke its little brain. There’s too much to screenshot, but maybe I’ll post some highlights later.
I love the analogy of an LLM-based chatbot to someone being interrogated. The distinct thing about LLMs right now, though, is that in the absence of knowledge they will tell you what they think you want to hear, even though you’ve applied no pressure to do so. That’s all they’re programmed to do.
LLMs are trained on a zillion pieces of text, each of which was written by some human for some reason. Some bits were novels, some were blog posts, some were Wikipedia entries, some were political platforms, some were cover letters for job applications.
They’re prompted to complete a piece of text that is basically an ongoing role-playing session, where the LLM mostly plays the part of “helpful AI personality” and the human mostly plays the part of “inquisitive human”. However, it’s all mediated over text, just like in a classic Turing test.
Some of the original texts the LLMs were trained on were role-playing sessions.
Some of those role-playing sessions involved people pretending to be AIs.
Or catgirls, wolf-boys, elves, or ponies.
The LLM is not trying to answer your questions.
The LLM is trying to write its part of an ongoing Internet RP session, in which a human is asking an AI some questions.
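To make that concrete, here is a rough sketch of how a chat could be flattened into one piece of text for the model to continue. Everything in it is an assumption for illustration: the `build_transcript` helper, the "Human"/"AI" speaker labels, and the preamble wording are all made up, and real chat products use their own formats. The point is only that the "conversation" ends up as a single document that stops right where the AI character is due to speak.

```python
from typing import List, Tuple

# Hypothetical preamble that sets the scene for the role-play.
PREAMBLE = "The following is a conversation between a curious human and a helpful AI."


def build_transcript(turns: List[Tuple[str, str]], preamble: str = PREAMBLE) -> str:
    """Flatten (speaker, text) turns into one block of text.

    The result ends with an open "AI:" line, so whatever text comes next
    is, by construction, the AI character's line in the script.
    """
    lines = [preamble, ""]
    for speaker, text in turns:
        lines.append(f"{speaker}: {text}")
    lines.append("AI:")  # the model is asked to continue from here
    return "\n".join(lines)


prompt = build_transcript([
    ("Human", "Did you make up that citation?"),
])
print(prompt)
```

Seen this way, a "confession" is just the most plausible next line for the AI character in that script, which is not the same thing as a report of what actually happened.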