crossposted from !chatgpt@lemdro.id
Yeah, this seems like a really tough problem with LLMs. From memory, OpenAI have said they're hoping to see a big improvement next year, which is a pretty long time given the rapid pace of everything else in the AI space.
I really hope they or others can make some big strides here because it really limits the usefulness of these models.
The whole problem I have is that the models are rewarded/refined for believability, not for accuracy.
Once there is enough LLM-generated shit on the web, it will be used (most likely inadvertently) to train newer LLMs, and we'll end up in a garbage-in, garbage-out deluge of accurate-sounding bullshit until much of the web becomes useless.
Yeah, 100% with you on that. I think the folks building these things are also aware of this issue, and maybe that's one of the reasons why ChatGPT's training set still ends in 2021. We'll have to wait and see what new solutions and techniques come along, but for now I think we're going to be stuck with this problem for a while.