Apparently, stealing other people's work to create a product for money is now "fair use" according to OpenAI, because they are "innovating" (stealing). Yeah. Move fast and break things, huh?
"Because copyright today covers virtually every sort of human expression—including blogposts, photographs, forum posts, scraps of software code, and government documents—it would be impossible to train today's leading AI models without using copyrighted materials," wrote OpenAI in the House of Lords submission.
OpenAI claimed that the authors in that lawsuit "misconceive[d] the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence."
If we're going by the number of pixels being viewed, then you have to use the same measure for both humans and AIs - and because AIs have to look at billions of images while humans do not, the AI still requires far more pixels than a human does.
And humans don't require the most modern art in order to learn to draw at all. Sure, if they want to compete with modern artists, they would need to look at modern artists (for which educational fair use exists, and again the quantity of art being used by the human for this purpose is massively lower than what an AI uses - a human does not need to consume billions of artworks from modern artists in order to learn what the current trends are). But a human could learn to draw, paint, sculpt, etc. purely by looking at public domain and creative commons works, because the process for drawing, say, the human figure (with the right number of fingers!) has not changed in hundreds of years. A human can also just… go outside and draw things they see themselves, because the sky above them and the tree across the street aren't copyrighted. And in fact, I'd argue that a good artist should go out and find real things to draw.
OpenAI's argument is literally that their AI cannot learn without using copyrighted materials in vast quantities - too vast for them to simply compensate all the creators. So it genuinely is not comparable to a human, because humans can, in fact, learn without using copyrighted material. If OpenAI's argument is actually that their AI can't compete commercially with modern art without using copyrighted works, then they should be honest about that - but then they'd be showing their hand, wouldn't they?
> Sure, if they want to compete with modern artists, they would need to look at modern artists
Which is the literal goal of DALL-E, Stable Diffusion, etc.
> But a human could learn to draw, paint, sculpt, etc. purely by looking at public domain and creative commons works
They could definitely learn some amount of skill, I agree. I'd be very interested to see the best that an AI could achieve using only PD and CC content. But you'd agree that it would look very different from modern art, just as an alien who has only been consuming earth media from 100+ years ago would be unable to relate to us.
> the sky above them and the tree across the street aren't copyrighted.
Yeah, I'd consider that PD/CC content that such an AI would easily have access to. But obviously the real sky is something entirely different from what is depicted in Starry Night, Star Wars, or H.P. Lovecraft's description of the cosmos.
> OpenAI's argument is literally that their AI cannot learn without using copyrighted materials in vast quantities
Yeah, I'd consider that a strong claim on their part; what they really mean is that it's the easiest way to make progress in AI, and that we wouldn't be anywhere close to where we are without it.
And you could argue "convenient that it both saves them money and generates money for them to do it this way", but I'd also point out that the alternative is they keep the trained models closed source, never using them publicly until they advance the tech far enough that they've literally figured out how to build/simulate a human brain that is able to learn as quickly and human-like as you're describing. And then we find ourselves in a world where one or two corporations have this incredible proprietary ability that no one else has.
Personally, I'd rather live in the world where the knowledge of how to do all of this isn't kept for one or two corporations to profit from - the version where they publish their work publicly, early, and often, show that it works, and people are able to reproduce it, open source it, train their own models, and advance the technology in a space where anyone can use it.
You could hypothesize a middle ground where they do the research but aren't allowed to profit from it without licensing every bit of data they train on. But the reality of AI research is that it only happens to the extent that it generates revenue. It's been that way for the entire history of AI. Douglas Hofstadter has been asking deep, important questions about AI as it relates to consciousness for some 60 years (e.g. GEB, I Am a Strange Loop), but there's a reason he didn't discover LLMs and tech companies did. That's not to say his writings are meaningless - in fact, I think they're more important than ever - but he was never going to get to this point with a small team of grad students, a research grant, and some public domain datasets.
So it's hard to disagree with OpenAI there: AI definitely wouldn't be where it is without them doing what they've done. And I'm a firm believer that unless we figure our shit out with energy generation soon, the earth will be an uninhabitable wasteland. We're playing a game of climb the Kardashev scale; we opted for the "burn all the fossil fuels as fast as possible" strategy, and now we're at the point where we either spend enough energy fast enough to figure out the tech needed to survive this, or we suffocate on the fumes. The clock is ticking, and AI may be our best bet at saving the human race that doesn't involve an inordinate number of people dying.
OpenAI are not going to make the source code for their model accessible to all to learn from. This is 100% about profiting from it themselves. And using copyrighted data to create open source models would seem to violate the very principles the open source community stands for - namely that everybody contributes what they agree to, and everything is published under a licence. If the basis of an open source model is a vast quantity of training data from a vast quantity of extremely pissed off artists, at least some of the people working on that model are going to have an "are we the baddies?" moment.
The AI models are also never going to produce a solution to climate change that humans will accept. We already know what the solution is, but nobody wants to hear it, and expecting anyone to listen to ChatGPT and suddenly change their minds about using fossil fuels is ludicrous. And an AI that is trained specifically on knowledge about the climate and technologies that can improve it, with the purpose of innovating some hypothetical technology that will fix everything without humans changing any of their behaviour, categorically does not need the entire contents of ArtStation in its training data. AIs that are trained to do specific tasks, like the ones trained to identify new antibiotics, are trained on a very limited set of data, most of which is not protected by copyright; any that is can be easily licenced because the quantity is so small. And you don't see anybody complaining about those models!
> OpenAI are not going to make the source code for their model accessible to all to learn from
OpenAI isn't the only company doing this, nor is their specific model the knowledge that I'm referring to.
> The AI models are also never going to produce a solution to climate change that humans will accept.
> We already know what the solution is, but nobody wants to hear it
Then it's not a solution. That's like telling your therapist, "I know how to fix my relationship, my partner just won't do it!"
> expecting anyone to listen to ChatGPT and suddenly change their minds about using fossil fuels is ludicrous
Lol. Yeah, I agree, that's never going to work.
> categorically does not need the entire contents of ArtStation in its training data.
That's a strong claim to make. Regardless of the ethics involved, or the problems the AI can solve today, the fact is we're seeing rapid advances in AI research as a direct result of these ethically dubious models.
In general, I'm all for the capitalist method of artists being paid their fair share for the work they do, but on the flip side, I see a very possible mass extinction event on the horizon, which could cause suffering the likes of which humanity has never seen. If we assume that is the case, and we assume AI has a chance of preventing it, then I would prioritize that over people's profits today. And I think it's perfectly reasonable to say I'm wrong.
And then there's the problem of actually enforcing any sort of regulation, which would be so much more difficult than people here are willing to admit. There's basically nothing you can do even if you wanted to. Your Carlin example is exactly the defense a company would use: "I guess our AI just happened to create a movie that sounds just like Paul Blart, but we swear it's never seen the film. Great minds think alike, I guess, and we sell only the greatest of minds."
It isn't wrong to use copyrighted works for training. Let me quote an article by the EFF here:
and
What you want would swing the doors open for corporate interference - hindering competition, stifling unwanted speech, and monopolization like nothing we've seen before. There are very good reasons people have these rights, and we shouldn't be trying to change this. Ultimately, it's apparent to me that you are in favor of these things - that you believe artists deserve a monopoly on ideas and non-specific expression, to the detriment of anyone else. If I'm wrong, please explain to me how.
> If we're going by the number of pixels being viewed, then you have to use the same measure for both humans and AIs - and because AIs have to look at billions of images while humans do not, the AI still requires far more pixels than a human does.
Humans benefit from years of evolutionary development and corporeal bodies to explore and interact with their world before they're ever expected to produce complex art. AIs need huge datasets to learn patterns that make up for this disadvantage. Nobody pops out of the womb with fully formed fine motor skills, pattern recognition, an understanding of cause and effect, shapes, comparison, counting, vocabulary related to art, and spatial reasoning. Datasets are huge and filled with image-caption pairs to teach models all of this from scratch. AI isn't human, and we shouldn't judge it against humans, just like we don't judge boats on their rowing ability.
> And humans don't require the most modern art in order to learn to draw at all. Sure, if they want to compete with modern artists, they would need to look at modern artists (for which educational fair use exists, and again the quantity of art being used by the human for this purpose is massively lower than what an AI uses - a human does not need to consume billions of artworks from modern artists in order to learn what the current trends are). But a human could learn to draw, paint, sculpt, etc. purely by looking at public domain and creative commons works, because the process for drawing, say, the human figure (with the right number of fingers!) has not changed in hundreds of years. A human can also just… go outside and draw things they see themselves, because the sky above them and the tree across the street aren't copyrighted. And in fact, I'd argue that a good artist should go out and find real things to draw.
AIs don't require the most modern art in order to learn to make images either, but the range of expression would be limited, just like a human's in this situation. You can see this in cave paintings and early sculptures. They wouldn't be limited to the same degree, but they would still be limited.
It took us 100,000 years to get from cave drawings to Leonardo da Vinci. This is just another step for artists, like the camera obscura was in the past. It's important to remember that early man was as smart as we are; they just lacked the interconnectivity to exchange ideas that we have.
I think the difference in artistic expression between modern humans and humans in the past comes down to the material available (like the actual material to draw with).
Humans can draw without ever seeing any image. Blind people can create art and draw things because we have a different understanding of the world around us than AI has. No human artist needs to look at a thousand pictures of a banana - or even one - to draw one.
The way AI sees and "understands" the world, and how it generates an image, is fundamentally different from how the human brain turns the concept of a banana into an image of a banana.
> I think the difference in artistic expression between modern humans and humans in the past comes down to the material available (like the actual material to draw with).
That is definitely a difference, but even that is a kind of information shared between people - and information itself is what gives everyone something to build on, a basis on which to advance understanding instead of wasting time coming up with the same things from scratch every time.
> Humans can draw without ever seeing any image. Blind people can create art and draw things because we have a different understanding of the world around us than AI has. No human artist needs to look at a thousand pictures of a banana - or even one - to draw one.
Humans don't need representations of things in images because they have the opportunity to interact with the genuine article, and in situations where that is impractical, they can still fall back on images to learn. Someone without sight from birth can't create art the same way a sighted person can.
> The way AI sees and "understands" the world and how it generates an image is fundamentally different from how the human brain turns the concept of a banana into an image of a banana.
That's the beauty of it all: despite that, these models can still output bananas.