Some argue that bots should be entitled to ingest any content they see, because people can.
You all really want a terminator-like future don’t you. Let’s use the most inflated possible wording, certainly there will be no issues
This is the best summary I could come up with:
Unfortunately, many people believe that AI bots should be allowed to grab, ingest and repurpose any data that’s available on the public Internet whether they own it or not, because they are “just learning like a human would.” Once a person reads an article, they can use the ideas they just absorbed in their speech or even their drawings for free.
Iris van Rooij, a professor of computational cognitive science at Radboud University Nijmegen in the Netherlands, posits that it’s impossible to build a machine to reproduce human-style thinking by using even larger and more complex LLMs than we have today.
NY Times Tech Columnist Farhad Manjoo made this point in a recent op-ed, positing that writers should not be compensated when their work is used for machine learning because the bots are merely drawing “inspiration” from the words like a person does.
“When a machine is trained to understand language and culture by poring over a lot of stuff online, it is acting, philosophically at least, just like a human being who draws inspiration from existing works,” Manjoo wrote.
In his testimony before a U.S. Senate subcommittee hearing this past July, Emory Law Professor Matthew Sag used the metaphor of a student learning to explain why he believes training on copyrighted material is usually fair use.
In fact, Microsoft, which is a major investor in OpenAI and uses GPT-4 for its Bing Chat tools, released a paper in March claiming that GPT-4 has “sparks of Artificial General Intelligence” – the endpoint where the machine is able to learn any human task thanks to it having “emergent” abilities that weren’t in the original model.
The original article contains 4,088 words, the summary contains 274 words. Saved 93%. I’m a bot and I’m open source!
Prove to me, right now, that you’re sentient. Or I won’t talk to you.
We don’t even know what sentience is, FFS.
There is a so-called “hard problem of consciousness”, although I take exception to calling it a problem.
The general problem is that you can’t really prove that you have subjective experience to others, and neither can you determine if others have it, or whether they merely act like they have it.
But, a somewhat obvious difference between AIs and humans is that AIs will never give you an answer that is not statistically derivable from their training dataset. You can give a human a book on a topic, and ask them about the topic, and they can give you answers that seem to be “their own conclusions” that are not explicitly from the book. Whether this is because humans have randomness injected into their reason, or they have imperfect reasoning, or some genuine animus of “free will” and consciousness, we cannot rightly say. But it is a consistent difference between the humans and the AIs.
The Monty Hall problem discussed in the article – in which AIs are asked to answer the Monty Hall problem, but are given explicit information that violates its assumptions – is a good example of something a human will tend to get right, through creativity, while an AI will tend to get it wrong, due to statistical regression to the mean.
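To make concrete why the assumptions matter, here is a rough simulation sketch. The "blind host" variant below is only an illustrative assumption, not necessarily the variation used in the article: the host opens one of the other two doors without knowing where the car is, and only games where a goat happens to be revealed are counted. The function name `switch_win_rate` and its parameters are made up for this example.

```python
import random

def switch_win_rate(host_knows, trials, rng):
    """Estimate how often switching wins, counting only games where a goat is revealed."""
    wins = 0
    games = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        others = [d for d in range(3) if d != pick]
        if host_knows:
            # Classic rules: the host deliberately opens a door hiding a goat.
            opened = rng.choice([d for d in others if d != car])
        else:
            # Hypothetical variant: the host opens one of the other doors blindly.
            opened = rng.choice(others)
            if opened == car:
                continue  # the car was revealed, so this game is not counted
        games += 1
        switched = next(d for d in others if d != opened)
        wins += switched == car
    return wins / games

rng = random.Random(42)
classic = switch_win_rate(True, 100_000, rng)
blind = switch_win_rate(False, 100_000, rng)
print(f"classic host: switching wins {classic:.3f} of the time")
print(f"blind host:   switching wins {blind:.3f} of the time")
```

Under the classic rules switching should win about two thirds of the time; with the blind host it should come out close to one half. Same surface story, different answer, which is why repeating the memorized "always switch" response fails once the setup changes.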
Well what an interesting question.
Let’s look at the definitions in Wikipedia:
Sentience is the ability to experience feelings and sensations.
Experience refers to conscious events in general […].
Feelings are subjective self-contained phenomenal experiences.
Alright, let’s do a thought experiment under the assumptions that:
- experience refers to the ability to retain information and apply it in some regard
- phenomenal experiences can be described by a combination of sensory data in some fashion
- performance is not relevant, as for the theoretical possibility, we only need to assume that with infinite time and infinite resources the simulation of sentience through AI needs to be possible
AI works by telling it what information goes in and what should come out; from that it infers the output for new patterns of information, and it adjusts according to “how wrong it was” to approximate the correct answer. Every feeling in our body is either chemical or physical, so for simplicity’s sake it can be measured / simulated through data input.
Let’s also say for our experiment that the appropriate output is to describe the feeling.
Now I think, knowing this, and knowing how well different AIs can already comment on, summarize or do any other transformative task on bigger texts that exposes them to interpretation of data, that it should be able to “express” what it feels. Let’s also conclude that everything needed to simulate a feeling or sensation can be described using different inputs of data points.
This brings me to the logical second conclusion that, scientifically speaking, there’s nothing about sentience that we wouldn’t be able to simulate already (in light of our assumptions).
Bonus: while my little experiment is only designed for theoretical possibility and we’d need some proper statistical calculations to know if this is practical in a realistic timeframe already and with a limited amount of resources, there’s nothing saying it can’t. I guess we have to wait for someone to try it to be sure.
Copyright and fair use are laws written for humans, to protect human creators and ensure them the ability to profit from their creativity for a limited time, and to grant immunity to other humans for generally accepted uses of that work without compensation.
I agree that sentience is irrelevant, but whether the actors involved are human or not is absolutely relevant.
This article acts like it is a privilege to read a book or hear a song and it can be revoked…lol
Rights are irrelevant.
You made something. That doesn’t give u the right to say what can or can’t ingest it.
Under these rules all fanfic would be illegal.
Search engines would be illegal…can’t scan my website that’s copyrighted. Radio would be illegal. Random ppl listening to ur songs…that’s a nono.
AI does not learn as we do when ingesting information.
I read an article about a subject. I will forget some of it. I will misunderstand some of it. I will not understand some of it. (These two are different because in misunderstanding I think I understand but I am wrong. In simply not understanding the information I can not make heads or tails of that portion)
Later when I make use of what I may have learned these same effects will happen again to whatever it was I correctly understood.
Another, I as a natural intelligence know what I can quote, and what I should not due to copyrights, social mores, and law. AI regurgitates everything that might match regardless of source.
The third issue: The AI does not understand even with copious training data. It does not know that dogs bark, it does not have a concept of a dog.
I once wrote a simpler program that took a body of text and, for each pair of adjacent letters, noted the letter that followed. It built probability tables from the pair of letters + the next letter. After ingesting what little training information I was able to give it, it would choose two letters at random and then generate each following letter using the statistics it had learned. It had no concept of words, much less the meaning of any words it might form.
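For anyone curious, a program like that can be sketched in a few lines. This is a reconstruction from the description above, not the original code: the names `build_table` and `generate` and the tiny corpus are made up, and it starts from a random pair that actually occurred in the text rather than two arbitrary letters, so there is always something to look up.

```python
import random
from collections import Counter, defaultdict

def build_table(text):
    """Count which letter follows each pair of adjacent letters."""
    table = defaultdict(Counter)
    for i in range(len(text) - 2):
        table[text[i:i + 2]][text[i + 2]] += 1
    return table

def generate(table, length, rng):
    """Start from a random seen pair, then sample each next letter from the counts."""
    out = rng.choice(sorted(table))
    while len(out) < length:
        followers = table.get(out[-2:])
        if not followers:
            break  # this pair was never seen with a follower, so stop early
        letters = list(followers)
        weights = [followers[c] for c in letters]
        out += rng.choices(letters, weights=weights)[0]
    return out

# Tiny made-up corpus, purely for illustration.
corpus = "the dog barks at the cat and the cat looks at the dog "
table = build_table(corpus)
sample = generate(table, 40, random.Random(0))
print(sample)
```

Every three-letter window in the output is one that occurred in the training text, yet nothing in the table knows what a word is, let alone a dog.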
I read an article about a subject. I will forget some of it. I will misunderstand some of it. I will not understand some of it. (These two are different because in misunderstanding I think I understand but I am wrong. In simply not understanding the information I can not make heads or tails of that portion)
Just because you’re worse at comprehension or have worse memory doesn’t make you any more real. And AIs also “forget” things, they also get stuff imperfectly, because they don’t store any actual “full length texts” or anything. It’s just separate words (more or less) and the likelihood of what should come next.
Another, I as a natural intelligence know what I can quote, and what I should not due to copyrights, social mores, and law. AI regurgitates everything that might match regardless of source.
Except you don’t, not perfectly. You can be absolutely sure that you often say something someone else has said or written, which means they technically have a copyright to it… But no one cares for the most part.
And it goes the other way too - you can quote something imperfectly.
Both actually can/do happen already with AIs, though it would be great if we could train them with proper attribution - at least for the clear cut cases.
The third issue: The AI does not understand even with copious training data. It does not know that dogs bark, it does not have a concept of a dog.
A sufficiently advanced artificial intelligence would be indistinguishable from natural intelligence. What sets them apart then?
You can look at animals, too. They also have intelligence, and yet there are many concepts that are incomprehensible to them.
The thing is though, how can you actually tell that you don’t work the exact same way? Sure the AI is more primitive, has fewer inputs - text only, no other outside stimuli - but the basis isn’t all that different.
When creating art do you get to make rules about who or what experiences it? Or is that a selfish asshole take?
Paint a picture but only some ppl get to see it. Sing a song but only some get to hear it.
What planet do you live on where those things are true?
Well, that’s the question at hand. Who? Definitely not, people have an innate right to think about what they observe, whether that thing was made by someone else, or not.
What? I’d argue that’s a much different question.
Let’s take an extreme case. Entertainment industry producers tried to write language into the SAG-AFTRA contract that said that, if an extra is hired for a production, they can use that extra’s image – including 3D spatial body scans – in perpetuity, for any purpose, and that privilege of eternal image storage and re-use was included in the price of hiring an extra for 1 day of work.
The producers would make precisely the same argument you are – how dare you tell them how they can use the images that they captured, even if it’s to use and re-use a person’s image and shape in visual media, forever. The actors argue that their physiognomy is part of their brand and copyright, and using their image without their express permission (and, should they require it, compensation) is a violation of their rights.
Or, I could just take pictures of somebody in public places without their consent and feed them into an AI to create pictures of the subject flashing children. They were my pictures, taken by me, and how dare anybody get to make rules about who or what experiences them, right?
The fact is, we have rules about the capture and re-use of created works that have applied to society for a very long time. I don’t think we should give copyright holders eternal locks on their work, but neither is it clear that a 100% free use policy on created work is the right answer. It is reasonable to propose something in between.
What is not a different question. As a creator you don’t get to say what or who can ingest your creation. If you did Google image search wouldn’t exist.
The thing you’re failing to realize is that this isn’t the first time a computer has been used to ingest info. The rules you assert have never been true to this point. Crawlers have been scanning web pages and images since the dawn of the Internet.
You act like this just started happening so now you get to put rules on what gets to look at that image. Too late there’s decades of precedent.