Wikipedia has a new initiative called WikiProject AI Cleanup. It is a task force of volunteers currently combing through Wikipedia articles, editing or removing false information that appears to have been posted by people using generative AI.

Ilyas Lebleu, a founding member of the cleanup crew, told 404 Media that the crisis began when Wikipedia editors and users began seeing passages that were unmistakably written by a chatbot of some kind.

237 points

Further proof that humanity neither deserves nor is capable of having nice things.

Who would set up an AI bot to shit all over the one remaining useful thing on the Internet, and why?

I’m sure the answer is either ‘for the lulz’ or ‘late-stage capitalism’, but still: historically humans aren’t usually burning down libraries on purpose.

115 points

State actors could be interested in doing that. Same with the Internet Archive attacks.

97 points

historically humans aren’t usually burning down libraries on purpose.

How on earth have you come to this conclusion?

35 points

To be fair, it’s usually to effect cultural genocide. It’s not average people burning libraries, it’s usually some kind of authoritarian regime.

33 points

*looks around and gestures broadly in agreement*

18 points

Florida says hello. A bunch of other places too, sadly :-(

13 points

historically humans aren’t usually burning down libraries on purpose.

Sometimes they are. Baghdad springs to mind, and I’m sure there are other examples. And this library is online, so there’s less chance of getting caught with a can of petrol and a box of matches.

Then there’s every authoritarian regime that tries to ban or burn specific types of books. What we’re seeing here could be more like that - an attempt to muddy the waters or introduce misinformation on certain topics.

9 points

Because basement losers can’t conquer and raze libraries to the ground.

The internet has shown that assumed anonymity results in people fucking with other people’s lives for the hell of it. Viruses, trolling, etc. This is just the next stage of it, because of a new, easy-to-use tool.

4 points

Yeah but the other thing about humanity is it’s mostly harmless. Edits can be reverted, articles can be locked. Wikipedia will be fine.

14 points

Edits can be reverted, articles can be locked.

Sure, but the vandalism has to be identified first. And that takes time and effort.

-3 points

Wikipedia relies on sources, and humans choosing the sources like newspapers. And those newspapers are more and more inside a “bubble” that rejects any evidence or reporting presented by a competing bubble.

Right now Wikipedia is covering up one of the greatest acts of mass murder of our times, because the newspapers are covering it up, or rejecting evidence because it’s by the “enemy”. Part of this is a defensive posture against AI bots and enemy disinformation.

4 points

It’s not about doing it on purpose; most people just don’t care about what’s not in their interest. Today interests are usually quite shallow, which TikTok shows quite well. Libraries do require money to operate, even the Internet Archive and Wikipedia.

3 points

People like this

1 point

Maybe a strange form of activism that is trying to poison new AI models 🤔

Which would not work, since all the tech giants have already archived the pre-AI internet.

8 points

Ah, so the AI version of the Chewbacca defense.

I have to wonder if intentionally shitting on LLMs with plausible nonsense is effective.

Like, you watch for certain user agents and change what data you actually send the bot vs what a real human might see.
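
Something like this, as a rough Python sketch. The `choose_body` helper and the crawler markers are made up for illustration (they are not real crawler names), and a crawler can always lie about its user agent, so this only shows the shape of the idea:

```python
# Hypothetical substrings that would identify a scraper's User-Agent header.
# These names are invented for the example, not real crawler identifiers.
BOT_MARKERS = ("examplebot", "sample-crawler")


def choose_body(user_agent: str, real_text: str, decoy_text: str) -> str:
    """Return decoy_text for user agents that look like a listed bot,
    and real_text for everyone else."""
    ua = user_agent.lower()
    if any(marker in ua for marker in BOT_MARKERS):
        return decoy_text
    return real_text


print(choose_body("Mozilla/5.0 (compatible; ExampleBot/1.0)", "real page", "decoy page"))
print(choose_body("Mozilla/5.0 (X11; Linux x86_64) Firefox/130.0", "real page", "decoy page"))
```

The matching is a plain case-insensitive substring check, so anything that doesn't announce itself gets the real page.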

2 points

I suspect it would be difficult to generate enough data to intentionally change a dataset. There are certainly little holes, like the glue pizza thing, but finding and exploiting them would be difficult and noticing you and blocking you as a data source would be easy.

0 points

I never said I think it is smart…

1 point

It’s because there’s no accountability for cybercrimes. If humans always had a button to burn down libraries, I’m sure they would have. Instead they had to put themselves in harm’s way to do such things.

People do things ’cause they can, and fucking with Wikipedia is apparently simple.

114 points

As for why this is happening, the cleanup crew thinks there are three primary reasons.

“[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,”

That last one. Ouch.

47 points

The vast majority of people think they’re the good guys…

8 points

Are we the baddies?

8 points

Without knowing you: probably.

1 point

Russian agriculture is in dire need of mechanization.


“[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,”

I think the main driver behind people being misinformed about AI content is that, outside of tech people, most have no idea that AI will:

  1. 100% make up answers to things it doesn’t know, because the sample of data it has ingested was either too small or bad. And it will do this with the same robotic confidence you get for any other answer.

  2. Begin to “hallucinate” and give some wild outputs once it has been fed too much other AI-generated content, very similar to humans suffering from schizophrenia. And again these answers will be given as “fact” with the same robotic confidence.

3 points

And then #2 will be copied by other people and AIs, becoming seen as fact.

6 points

Well, I was in doubt, so I asked the AI whether I could trust the answers and it told me not to worry about it. That must mean that I only get accurate answers, right? /s

69 points

Unleashing generative AI on the world was basically the information equivalent of jumping headfirst into Kessler Syndrome.

44 points

For the uninitiated like me:

The Kessler syndrome (also called the Kessler effect, collisional cascading, or ablation cascade), proposed by NASA scientists Donald J. Kessler and Burton G. Cour-Palais in 1978, is a scenario in which the density of objects in low Earth orbit (LEO) due to space pollution is numerous enough that collisions between objects could cause a cascade in which each collision generates space debris that increases the likelihood of further collisions.

Wikipedia link.

19 points

Good call, thank you.

Also: Referencing Wikipedia in this context is kinda funny.

11 points

I did think that. :) It’s just… so good. I hope it never enshittifies. God help us.

52 points

Best case is that the model used to generate this content was originally trained on data from Wikipedia, so it “just” generates a worse, hallucinated “variant” of the original information. Goes to show how stupid this idea is.

Imagine this in a loop: AI trained on Wikipedia that then alters content on Wikipedia, which in turn gets picked up by the next model being trained. It would just get worse and worse, similar to how converting the same video over and over again yields continuously worse results.

24 points

See also: model collapse

(Which is more or less just regression towards the mean with more steps)
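
If you want to see the flavour of it, here is a toy sketch in Python. It is not how any real language model is trained: it assumes the “model” is just a normal distribution that gets refit, generation after generation, to a small sample drawn from the previous generation’s fit. The function name and the parameters are made up for the example.

```python
import random
import statistics


def simulate_refitting(generations=200, sample_size=10, seed=0):
    """Repeatedly fit a normal distribution to samples drawn from the
    previous fit and return the fitted spread after each generation."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0          # generation 0: the "human-written" data
    spreads = [sigma]
    for _ in range(generations):
        # Train only on what the previous model generated.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # maximum-likelihood spread
        spreads.append(sigma)
    return spreads


spreads = simulate_refitting()
print(f"initial spread: {spreads[0]:.3f}, final spread: {spreads[-1]:.3g}")
```

With a small sample each round, the fitted spread tends to shrink over the generations, so the tails of the original distribution are the first thing to go.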

15 points

Yes, this is what many of us worry the internet in general will become: AI content generated by AI trained on AI garbage.

AI bots can trivially outpace humans.

11 points

I was just discussing with a friend of mine how we’re rapidly approaching the dead internet. At some point, many websites will likely just be chat bots talking to other chat bots, which then gets used to train further chat bots. Human-made content is already becoming harder and harder to find on algorithm-heavy websites like Reddit and Facebook’s suite of sites. The bots can easily outpace any algorithmic changes the sites might make to deter them: my Facebook-using family members all block those weird Jesus accounts, and they still show up constantly.

7 points

Eventually every article just reads “Delve delve delve delve delve delve delve.”

6 points

A very similar situation to the one analysed in this paper that was recently published. The quality of what is generated degrades significantly.

Although they mostly investigate replacing the data with AI-generated data at each step, so I doubt the effect will be as pronounced in practice. Human writing will still be included, and even curation of AI-generated text by people can skew the distribution of the training data (as the work of these editors inevitably would, since reasonable-looking text can slip through the cracks).

2 points

AI model makers are very well aware of this, and there is a move from ingesting everything to curating datasets more aggressively. Data prep is something many upstarts have no idea is critical, but everyone is learning about it, sometimes the hard way.

3 points

Every article would end up being the philosophy page.

43 points

Jesus Christ. The amount of absolute bellends in the world never ceases to confound me.

33 points

They used to be contained: every village had its idiot. Now that the internet is the global village, all the formerly isolated idiots have a place to chat.

7 points

Amazing how these idiots are this effective…

While us common folk can’t organize or agree on anything

6 points

Most of us do something idiotic once and, when the opportunity to do it again comes up, pull back and think, “This was embarrassing last time, maybe I’ll re-evaluate.”

But a dedicated idiot is a different beast, full of confidence and with whatever organ produces shame surgically removed, enabling them to commit ever greater acts of idiocy. But then the internet was invented and these people met. Some even had babies. And now there is an arms race to see how many idiots can squeeze through the same tiny door. They have recognised their time to shine and seized it with their clammy yet also sticky hands.

Truly, it’s inspiring in its own special way


Technology

!technology@lemmy.world
