You’re humanizing the software too much. Comparing software to human behavior is just plain wrong. GPT can’t even reason properly yet. I can’t see this as anything other than a more advanced collage process.
OpenAI used intellectual property without the consent of the owners. Major fucked.
If ‘anybody’ does anything similar to tracing, copy&pasting or even sampling a fraction of another person’s imagery or written work, that anybody is violating copyright.
If ‘anybody’ does anything similar to tracing, copy&pasting or even sampling a fraction of another person’s imagery or written work, that anybody is violating copyright.
Ok, but tracing is literally a part of the human learning process. If you trace a work and sell it as your own that’s bad. If you trace a work to learn about the style and let that influence your future works that is what every artist already does.
The artistic process isn’t copyrighted, only the final result. The exact same standards can apply to AI generated work as already do to anything human generated.
i don’t know the specifics of the lawsuit but i imagine this would parallel piracy.
in a way you could say that OpenAI has pirated software directly from multiple intellectual properties. OpenAI has distributed software which emulates skills and knowledge. remember this is a tool, not an individual.
It’s not exactly the same thing, but here’s an article by Kit Walsh, a senior staff attorney at the EFF, that explains how image generators work within the law. You can see how the same ideas would apply to text. The EFF is a digital rights group that most recently won a historic case: border guards now need a warrant to search your phone.
Here are some excerpts:
First, copyright law doesn’t prevent you from making factual observations about a work or copying the facts embodied in a work (this is called the “idea/expression distinction”). Rather, copyright forbids you from copying the work’s creative expression in a way that could substitute for the original, and from making “derivative works” when those works copy too much creative expression from the original.
Second, even if a person makes a copy or a derivative work, the use is not infringing if it is a “fair use.” Whether a use is fair depends on a number of factors, including the purpose of the use, the nature of the original work, how much is used, and potential harm to the market for the original work.
And:
…When an act potentially implicates copyright but is a necessary step in enabling noninfringing uses, it frequently qualifies as a fair use itself. After all, the right to make a noninfringing use of a work is only meaningful if you are also permitted to perform the steps that lead up to that use. Thus, as both an intermediate use and an analytical use, scraping is not likely to violate copyright law.
I’d like to hear your thoughts.
You’re mystifying and mythologising humans too much. The learning processes are very similar.
Well, there’s still a shit ton we don’t understand about humans.
We do, however, understand everything about machine learning.
LOL
We understand less about how LLMs generate a single output than we do about the human brain. You clearly have no experience developing models.
sampling a fraction of another person’s imagery or written work.
So citing is a copyright violation? A scientific discussion on a specific text is a copyright violation? This makes no sense. It would mean your work couldn’t build on anything else, and that’s plain stupid.
Also, to your first point about reasoning and the advanced collage process: you are right and wrong. Yes, an LLM doesn’t have the ability to use all the information a human has or to be as precise, so it can’t reason the same way a human can. BUT, and that is a huge caveat, the inherent goal of AI, and of neural networks in their simplest form, was to replicate human thinking. If you look at the brain and then at AIs, you will see how close the process is. Training usually goes like this: you give the AI an input, the AI tries to give the desired output, then the AI gets told what the output should have looked like, and then it backpropagates the error to adjust its process. This is already pretty advanced and human-like (look at how the brain is made up and then at how AI models are made up; it’s basically the same concept).
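If it helps, here is roughly what that loop looks like as a toy sketch. This is not how GPT or any real model is implemented: it is a single made-up "neuron" with two weights learning the rule y = 2x + 1, and the function name `train`, the learning rate, and the data are all my own assumptions for illustration.

```python
# Toy training loop: one "neuron" (pred = w*x + b) learning y = 2x + 1.
# Purely illustrative; real models have billions of weights and many layers.

def train(examples, lr=0.05, epochs=500):
    w, b = 0.0, 0.0  # start out knowing nothing
    for _ in range(epochs):
        for x, target in examples:
            pred = w * x + b        # 1. the model produces an output
            error = pred - target   # 2. it is told how far off it was
            # 3. gradients of the loss 0.5 * error**2 (chain rule, i.e. backprop)
            grad_w = error * x
            grad_b = error
            # 4. nudge each weight a little against its gradient
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Hypothetical training data generated from the rule y = 2x + 1.
examples = [(x, 2 * x + 1) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]
w, b = train(examples)
print(round(w, 3), round(b, 3))
```

The weights end up close to 2 and 1, even though the rule was never written into the model. It only ever saw examples and corrections.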
Now you would be right to say “well, in its simplest form an LLM like GPT is just predicting which character or word comes next”, and you would be partially right. But in that process it incorporates all of the “knowledge” it got from its training sessions, plus a few valuable tricks to improve. The truth is, the differences between a human brain and an AI are marginal, and they mostly boil down to efficiency and training time.
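To make "predicting which word comes next" concrete, here is a deliberately crude sketch. It is a bigram counter, which is far simpler than the transformer behind GPT, so treat it only as the bare shape of the idea. The function names `fit_bigrams` and `predict_next` and the tiny corpus are made up for this example.

```python
from collections import Counter, defaultdict

def fit_bigrams(text):
    """Count, for every word, which words followed it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never followed."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical training text.
corpus = "the cat sat on the mat and the cat slept"
model = fit_bigrams(corpus)
print(predict_next(model, "the"))
```

A real LLM replaces the lookup table with a neural network that conditions on the whole preceding context, which is where the "knowledge" and the tricks come in.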
And to say that LLMs are just “an advanced collage process” is like saying “a car is just an advanced horse”. You’re not technically wrong but the description is really misleading if you look into the details.
And for detail’s sake, this is what the paper for Llama 2 looks like, the latest big LLM from Facebook, which is said to be the current standard for LLM development: