Wikipedia is under assault: rogue users keep posting AI-generated nonsense (www.techspot.com)

posted 2 months ago

ForgottenFlux@lemmy.world

technology@lemmy.world

91 comments

Wikipedia has a new initiative called WikiProject AI Cleanup. It is a task force of volunteers currently combing through Wikipedia articles, editing or removing false information that appears to have been posted by people using generative AI.

Ilyas Lebleu, a founding member of the cleanup crew, told 404 Media that the crisis began when Wikipedia editors and users began seeing passages that were unmistakably written by a chatbot of some kind.

narc0tic_bird@lemm.ee

52 points

1 month ago

Best case is that the model used to generate this content was originally trained on data from Wikipedia, so it “just” generates a worse, hallucinated “variant” of the original information. Goes to show how stupid this idea is.

Imagine this in a loop: AI trained on Wikipedia then alters content on Wikipedia, which in turn gets picked up by the next model to be trained. It would just get worse and worse, similar to how converting the same video over and over again yields continuously worse results.

huginn@feddit.it

24 points

1 month ago

See also: model collapse

(Which is more or less just regression towards the mean with more steps)

Wrench@lemmy.world

15 points

1 month ago

Yes, this is what many of us worry the internet in general will become: AI content generated by AI trained on AI garbage.

AI bots can trivially outpace humans.

kboy101222@sh.itjust.works

11 points

1 month ago

I was just discussing with a friend of mine how we’re rapidly approaching the dead internet. At some point, many websites will likely just be chat bots talking to other chat bots, whose output then gets used to train further chat bots. Human-made content is already becoming harder and harder to find on algorithm-heavy websites like Reddit and Facebook’s suite of sites. The bots can easily outpace any algorithmic changes the sites might make to deter them: my Facebook-using family members all constantly block those weird Jesus accounts, and they still show up constantly.

Captain Aggravated@sh.itjust.works

7 points

1 month ago

Eventually every article just reads “Delve delve delve delve delve delve delve.”

8uurg@lemmy.world

6 points

1 month ago

This is very similar to the situation analysed in this paper that was recently published. The quality of what is generated degrades significantly.

They mostly investigate replacing the data with AI-generated data in each step, though, so I doubt the effect will be as pronounced in practice. Human writing will still be included, and even curation of AI-generated text by people can skew the distribution of the training data (as the cleanup process run by these editors would inevitably do, since reasonable-looking text could slip through the cracks).
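
To make that difference concrete, here is a toy sketch. It is not the paper's experiment: the Gaussian stand-in for a "model", the `simulate` function, and every parameter value are assumptions chosen purely for illustration. Each generation fits a mean and a standard deviation to its training data; the next generation then trains on samples drawn from that fit, optionally mixed with fresh "human" samples from the original distribution.

```python
import random
import statistics


def simulate(generations, sample_size, human_fraction, seed=0):
    """Return the fitted standard deviation after each generation.

    human_fraction is the share of each training set drawn from the
    original ("human") distribution; the rest comes from the previous
    generation's fitted model. All names here are hypothetical.
    """
    rng = random.Random(seed)
    human_mu, human_sigma = 0.0, 1.0          # the original distribution
    fit_mu, fit_sigma = human_mu, human_sigma  # generation 0 is a perfect fit
    history = [fit_sigma]

    n_human = round(sample_size * human_fraction)
    n_synthetic = sample_size - n_human

    for _ in range(generations):
        # Fresh human-written data, if any is still included.
        data = [rng.gauss(human_mu, human_sigma) for _ in range(n_human)]
        # Data generated by the previous generation's model.
        data += [rng.gauss(fit_mu, fit_sigma) for _ in range(n_synthetic)]

        # "Train" the next model: a maximum-likelihood Gaussian fit.
        fit_mu = statistics.fmean(data)
        fit_sigma = statistics.pstdev(data)
        history.append(fit_sigma)

    return history


# Fully synthetic training data versus a half-human mix.
synthetic_only = simulate(generations=500, sample_size=20, human_fraction=0.0, seed=1)
half_human = simulate(generations=500, sample_size=20, human_fraction=0.5, seed=1)

print(f"synthetic only, final spread: {synthetic_only[-1]:.3g}")
print(f"half human, final spread:     {half_human[-1]:.3g}")
```

With these settings, the fully synthetic run should see its fitted spread shrink towards zero over the generations, while the half-human run should stay in the neighbourhood of the original spread. It is only an analogy for what happens with text models, but it shows why keeping human data in the mix matters.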

Blaster M@lemmy.world

2 points

1 month ago

AI model makers are well aware of this, and there is a move from ingesting everything to curating datasets more aggressively. Many upstarts have no idea that data prep is critical, but everyone is learning, sometimes the hard way.

Zorque@lemmy.world

3 points

1 month ago

Every article would end up being the philosophy page.

Technology

!technology@lemmy.world

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related content.
Be excellent to each other!
Mod-approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below; to ask if your bot can be added, please contact us.
Check for duplicates before posting; duplicates may be removed.

Approved Bots

Community stats

16K
Monthly active users
12K
Posts
552K
Comments
