Maven Imported 1.12 Million Fediverse Posts


13 points

9 months ago

Oh shit, the persona guy was right! We should all be adding a license to our comments, so that they could not legally train models that are then used for commercial purposes.


onlinepersona@programming.dev

1 point

9 months ago

It’s especially for these kinds of dumb cases where they simply copy content wholesale and boast about it. With more people licensing their content as non-commercial, the “hot water” these companies get into could be not just trivial but actually legal.

Would be great if web and mobile clients supported signatures or a “licence” field from which signatures were generated. Even better would be if people smarter than me added a feature to poison AI training data. This could also be done by a signature or some other method.

Anti Commercial-AI license


TheGalacticVoid@lemm.ee

1 point

9 months ago

I don’t know; AFAIK, Reddit successfully argued that they own Wallstreetbets’ trademarks in court. That might void all of these licenses depending on the ToS of the instance being used.


Danterious@lemmy.dbzer0.com

4 points

9 months ago

Yeah, they were. I hope more people start doing it. Even if it doesn’t legally hold water, it’s still a good way to show that fediverse users won’t stand for that.

Anti Commercial-AI license (CC BY-NC-SA 4.0)


onlinepersona@programming.dev

1 point

9 months ago

Why do you think it won’t hold water legally? There’s a case going right now against GitHub Copilot for scraping GPL-licensed code, even spitting it back out verbatim, and not making “open” AI actually open.

Creative Commons is not a joke licence. It actually is used by artists, authors, and other creative types.

Imagine Maven or another company doing the same shit they just did and it coming to light that there was a bunch of noncommercially licensed content in there. The authors could band together for a class action lawsuit and sue their asses. Given the reaction of users here and on Mastodon, I wouldn’t even be surprised if it did happen.

Anti Commercial-AI license


Danterious@lemmy.dbzer0.com

1 point

9 months ago

I mostly mention that to fend off people who base their argument mainly on effectiveness, because that’s not why I’m doing it.

I do think it could work legally if the courts want to remain consistent, but that isn’t guaranteed.

Anti Commercial-AI license (CC BY-NC-SA 4.0)


0 points

9 months ago

Lol that shit don’t do shit


Blaze@reddthat.com

7 points

9 months ago

@onlinepersona@programming.dev


onlinepersona@programming.dev

7 points

9 months ago


Thanks for linking me 🙏 The makers of Maven probably set off a bomb now and people might ask for anti-AI features on the clients and servers.

Anti Commercial-AI license


Pennomi@lemmy.world

18 points

9 months ago

The easiest way is a sitewide NoAI meta tag, since it’s the current standard. Researchers are much more likely to respect a common standard and extremely unlikely to respect a single user’s personal solution adding a link to their comments.
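For illustration, here is a minimal sketch of how a well-behaved scraper might check for such an opt-out before using a page. It assumes the tag in question is the `noai` directive placed in a robots meta tag, e.g. `<meta name="robots" content="noai, noimageai">`, which is an informal convention and not something every crawler honours. The names `RobotsMetaParser` and `opts_out_of_ai` are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def opts_out_of_ai(html: str) -> bool:
    """Return True if the page carries a robots meta tag with 'noai'.

    Assumption: 'noai' is the opt-out directive being discussed here.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noai" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noai, noimageai"></head></html>'
    print(opts_out_of_ai(page))
```

On the instance side this would amount to the server emitting that one meta tag in the head of every page. Whether anyone respects it is, as with robots.txt, entirely up to the scraper.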


iAvicenna@lemmy.world

4 points

9 months ago

I feel like the bad thing about this is, whereas the researchers will mostly respect this, companies who want to make money out of data will still secretly keep using the data anyways. I am more ok with the data being used for non-profit research and not for making money but this would likely have the opposite effect.


Pennomi@lemmy.world

1 point

9 months ago

If that’s truly the case, nothing on earth can protect your data.

That being said, large corporations are far more liable to consumer protection lawsuits, especially in areas like the EU.


Scrubbles@poptalk.scrubbles.tech

6 points

9 months ago

This is the only way I see it being acceptable. How do we add this to instances?


katy ✨@lemmy.blahaj.zone

-1 points

9 months ago

yeah but who posts to mastodon under public instead of unlisted/quiet public?


Sean Tilley@lemmy.world (OP)

8 points

9 months ago

Pretty much everybody.


misk@sopuli.xyz

29 points

9 months ago


That’s why I keep saying it’s pointless to defederate corpos. They’ll just scrape everything before you notice.


Pennomi@lemmy.world

3 points

9 months ago

Plus even if you defederate them, oops, it’s all public anyway!


Blaze@reddthat.com

9 points

9 months ago

Defederation is more about not being flooded with 1000x more users than the Fediverse currently has


misk@sopuli.xyz

1 point

9 months ago

So far we only have a corpo fedi-Twitter in the form of Threads. In that case a non-corpo instance user has to specifically follow someone before their content is federated, so that sounds like a bit of an overblown issue.


Blaze@reddthat.com

1 point

9 months ago

Seems pretty easy for any corporation to set up something like https://lemmy-federate.com/ but for Mastodon/IceShrimp/Misskey accounts, to federate the important corporate accounts to the targeted non-corpo instances.


1 point

9 months ago

Unfortunately a lot of people think it’s to do with scraping as well. The amount of “defederate Threads so that they can’t scrape my data” posts I saw was about 50-50 with the sensible takes.


zoey@lemmy.blahaj.zone

28 points

9 months ago

The fact they even got DMs from at least one instance is crazy.


GBU_28@lemm.ee

1 point

9 months ago

Well the problem is user perception/understanding.

The reality is they were literally direct messages, not private messages.


mke@lemmy.world

27 points

9 months ago


And it’s also damning for private messaging on Mastodon.

I once read vague complaints about it being a rushed implementation. While I won’t trust those without evidence, I for sure wouldn’t trust Mastodon with my PMs. At least, not until how this was allowed to happen is figured out and fixed if necessary.

P.S. I’m still not sure I believe in PMs in the fediverse. If I need to share something and care about keeping it private, I’d rather move the conversation elsewhere.


technomad@slrpnk.net

17 points

9 months ago

I was under the impression that DMs on Mastodon (and Lemmy too) were never stated to be secure, and I think that they were both pretty transparent about this particular aspect.


threelonmusketeers@sh.itjust.works

40 points

9 months ago

I was confused about what they were trying to accomplish, and even after reading the article I am still somewhat confused.

Instead, when a user posts something, the algorithm automatically reads the content and tags it with relevant interests so it shows up on those pages. Users can turn up the serendipity slider to branch out beyond their stated interests, and the algorithm running the platform connects users with related interests.

Perhaps I’m a minority, but I don’t see myself getting much utility out of this. I already know what my interests are, and don’t have much interest in growing them algorithmically. If a topic is really interesting, I’ll eventually find out about it via an actual human.


DeprecatedCompatV2@programming.dev

0 points

9 months ago


So you don’t ever want to learn about new things? And even if you did, you wouldn’t want those new things to be efficiently suggested to you, and would instead have them bundled with a bunch of other boring crap?

Also, what you’re asking for is what the tool seems to do. You would put the slider all the way to one side to avoid having new stuff suggested. Existing social media platforms often just shove stuff at you endlessly.


Plopp@lemmy.world

1 point

9 months ago

Instead, when a user posts something, the algorithm automatically reads the content and tags it with relevant interests so it shows up on those pages.

Motherfucker this is what hashtags are for.


technomad@slrpnk.net

28 points

9 months ago

Yeah, we’re trying to get the fuck away from algorithms. That’s what makes the fediverse such a big draw currently, for me.


FaceDeer@fedia.io

3 points

9 months ago

You’re on slrpnk.net, I assume it’s not implementing any of this stuff. As long as you don’t sign up for Maven I don’t see how this is going to affect you.


technomad@slrpnk.net

7 points

9 months ago


I mean yeah, maybe it won’t affect me directly, I like the instance I’m on and it’s a pretty respectable one. However, indirectly, this is very relevant to any Fediverse user, regardless of the instance or platform they’re using. Allowing abuses like this to happen without any pushback is a surefire way of turning this place into a shithole just like the rest of the internet. I appreciate the fact that, at least for now, it’s different here.

Also, maybe this isn’t my only homebase? Just saying.


Scrubbles@poptalk.scrubbles.tech

13 points

9 months ago

Only algorithm I need is posts I subscribe to, in descending order. That’s about it
