cross-posted from: https://feddit.org/post/2958203

There is an interesting study (May 2024), also linked in the article: When Online Content Disappears

Historians of the future may struggle to understand fully how we lived our lives in the early 21st Century. That’s because of a potentially history-deleting combination of how we live our lives digitally – and a paucity of official efforts to archive the world’s information as it’s produced these days.

However, an informal group of organisations are pushing back against the forces of digital entropy – many of them operated by volunteers with little institutional support. None is more synonymous with the fight to save the web than the Internet Archive, an American non-profit based in San Francisco, started in 1996 as a passion project by internet pioneer Brewster Kahle. The organisation has embarked on what may be the most ambitious digital archiving project of all time, gathering 866 billion web pages, 44 million books, 10.6 million videos of films and television programmes and more. Housed in a handful of data centres scattered across the world, the collections of the Internet Archive and a few similar groups are the only things standing in the way of digital oblivion.

“The risks are manifold. Not just that technology may fail, but that certainly happens. But more important, that institutions fail, or companies go out of business. News organisations are gobbled up by other news organisations, or more and more frequently, they’re shut down,” says Mark Graham, director of the Internet Archive’s Wayback Machine, a tool that collects and stores snapshots of websites for posterity. There are numerous incentives to put content online, he says, but there’s little pushing companies to maintain it over the long term.

Despite the Internet Archive’s achievements thus far, the organisation and others like it face financial threats, technical challenges, cyberattacks and legal battles from businesses who dislike the idea of freely available copies of their intellectual property. And as recent court losses show, the project of saving the internet could be just as fleeting as the content it’s trying to protect.

“More and more of our intellectual endeavours, more of our entertainment, more of our news, and more of our conversations exist only in a digital environment,” Graham says. “That environment is inherently fragile.”

58 points

No

Not when they’re bound by the DMCA

-15 points

Never heard of the snapshot feature, have ya?

I’ve got terabytes of things archived, still there.

23 points

They can all be DMCAd and gone by tomorrow.

-14 points

I don’t even have an account, silly. Think I’m stupid enough to do that?

16 points

I wonder how much of that vanished 25% is pages that are “retail”… Although I guess it’s like having an old Sears catalog around to document prices(?). My old Tripod pages from my high school reunion in 2001 are still alive!

4 points

For real? Haha, that’s awesome. My Angelfire ones are all gone.

2 points

Yes!!! I thought about it the other week and searched, they are all still there!

3 points

Wish there was an archival program with, like, government-level funding or something that’s immune to the DMCA, because everything’s being lost so fast. History is very important!

-23 points

Why should we care? Never in our history have we chronicled all the mundane things like we find on the internet. It’s like saying we need to archive all the kids’ drawings or all the personal journals, or record all in-person conversations and archive them… Just because it’s digital data that can be archived if we feel like it doesn’t mean it’s important.

33 points

Not enough of the mundane has been preserved throughout human history, and it continues to be a big problem for historians, especially when all they have to go on are the major official records of events and cultural touchstones, which are likely very coloured or outright lies.

Why do you think we get so incredibly excited when we uncover something as mundane as the pricing artwork on an ancient Roman food stall? Because that stuff wasn’t preserved, nobody bothered to record such details, so much is lost because nobody thinks their place in history matters enough to bother saving it.

We’ve reached a point in our development where we now have the ability to preserve snapshots of our civilisation in great detail, with extreme ease. We owe it to ourselves and especially to future generations to do so.

-1 points

That preservation comes at a cost though, both monetary and environmental, and the amount of data to preserve increases exponentially.

9 points

There are really great WriteOnceReadOnlyAfterMemory solutions that hold mind-blowing amounts of data without needing to be powered 😁, like glass and laser for example.

2 points

This was an underlying plot point in Metal Gear Solid 2.

13 points

There are a few things going on. At first blush, I agree with you. The vast majority of that stuff doesn’t need to be captured.

But if you don’t capture everything, how do you know you got the stuff that will be important or wanted in the future?

Also, historians are going to find that data to be an absolute gold mine. Unfortunately, a lot of it is in the form of video now and takes a ton of storage space.

I think, in the end, most people are not willing to pay the price to archive everything. But some are, and they’re doing it.

5 points

It reminds me of that Black Mirror episode where people have memory chips to allow them to relive everything they’ve ever lived.

