Just need to get AI on that.
We need to embrace AI written content fully. Language is just a protocol for communication. If AI can flesh out the “packets” for us nicely in a way that fits what the receiving humans need to understand the communication then that’s a major win. Now I can ask AI to write me a nice letter and prompt it with a short bulleted list of what I want to say. Boom! Done, and time is saved.
The professional writers who used to slave over a blank Word document are now obsolete, just like the slide rule “computers” of old (the people who could solve complicated mathematics and engineering problems on paper).
Teachers who thought a hand written report could be used to prove that “education” has happened are now realizing that the idea was a crutch (it was 25 years ago too when we could copy/paste Microsoft Encarta articles and use as our research papers).
The technology really just shows us that our language capabilities really are just a means to an end. If a better means asrises we should figure out how to maximize it.
Or, because you can’t rely on computers to tell you the truth. Which is exactly the issue with LLMs as well.
I was mostly referring to the top comment. If you need to write an essay on Hamlet, the book can in fact not lie, because the entire exercise is to read the book and write about the contents of it.
But in general, you are right. (Which is why it is proper journalistic procedure to talk to multiple experts about a topic you write about. Also a good article does not present a forgone conclusion, but instead let’s readers form their own opinion on a topic by providing the necessary context and facts without the author’s judgement. LLMs as a one-stop-shop do not provide this and are less reliable than listening to a single expert would be)
OpenAI discontinued its AI Classifier, which was an experimental tool designed to detect AI-written text. It had an abysmal 26 percent accuracy rate.
If you ask this thing whether or not some given text is AI generated, and it is only right 26% of the time, then I can think of a real quick way to make it 74% accurate.
it seemed like a really weird decision for OpenAI to have an AI classifier in the first place. their whole business is to generate output that’s good enough that it can’t be distinguished from what a human might produce, and then they went and made a tool to try and point out where they failed.
I feel like this must stem from a misunderstanding of what 26% accuracy means, but for the life of me, I can’t figure out what it would be.
In statistics, everything is based off probability / likelihood - even binary yes or no decisions. For example, you might say “this predictive algorithm must be at least 95% statistically confident of an answer, else you default to unknown or another safe answer”.
What this likely means is only 26% of the answers were confident enough to say “yes” (because falsely accusing somebody of cheating is much worse than giving the benefit of the doubt) and were correct.
There is likely a large portion of answers which could have been predicted correctly if the company was willing to chance more false positives (potentially getting studings mistakenly expelled).
Looks like they got that number from this quote from another arstechnica article ”…OpenAI admitted that its AI Classifier was not “fully reliable,” correctly identifying only 26 percent of AI-written text as “likely AI-written” and incorrectly labeling human-written works 9 percent of the time”
Seems like it mostly wasn’t confident enough to make a judgement, but 26% it correctly detected ai text and 9% incorrectly identified human text as ai text. It doesn’t tell us how often it labeled AI text as human text or how often it was just unsure.
EDIT: this article https://arstechnica.com/information-technology/2023/07/openai-discontinues-its-ai-writing-detector-due-to-low-rate-of-accuracy/
Did human-generated content really become so low quality that it is distinguishable from AI-generated content?
Should I be able to detect whether or not this is an AI generated comment?