In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.

6 points

This is a sentiment I’ve heard often but haven’t been able to understand. What is the new risk of expressing your thoughts, prose, or poetry online that didn’t exist before and now exists with LLMs scraping them? How would corporations exploit your work through data scraping in a way that would demotivate you from expressing it at all? Because I know tone doesn’t come across well in text, I want to clarify that these are genuine questions. My answers to them seem to be very different from many people’s, and I’d like to understand where that difference in perspective comes from.

5 points

I think this largely boils down to the time scales required. A person copying your work needs a minimum amount of time to do it, even when it’s just copy and paste. An LLM can copy code from thousands of different developers, for instance, and completely launder the license. That’s not ok. Why would we allow machines to commit fraud when we don’t allow people to?

2 points

This is very interesting for me to think about, since I have so many issues with proprietary technology in general. An LLM copying the code from thousands of proprietary projects is kind of an interesting loophole, considering that it would be difficult for any of the individual businesses to prove that their proprietary code was infringed unless the LLM copies and pastes the code exactly. That could cause major changes in the tech industry which I’m not able to predict. Ideally, I would like technological development to be more in the hands of people, as with open source code, than behind legal barriers. I am not a programmer, though, so take my musings with a grain of salt.

2 points

Except that isn’t exactly how neural networks learn. They aren’t exactly copying work, they’re learning patterns in how humans make those works in order to imitate them. The legal argument these companies are making is that the results from using AI are transformative enough that they qualify as totally new and unique works, and it looks as if that might end up becoming law, depending on how the lawsuits currently going through the courts turn out.

To be clear, technically an LLM doesn’t copy any of the data, nor does it store any data from the works it learns from.

1 point

Except what it produces is very similar or identical to some copyrighted works licensed under the LGPL, like in this case. You don’t have to copy a whole program to plagiarize someone.

