The recent downtime of the #InternetArchive reminded me of two things:

(a) How vital the site is for my own work. Fortunately, I save pretty much all old books I need for my work to my hard drive, so I am not totally lost without it - but still, most of the links to the individual folk tales I am translating go to online archives, and the Internet Archive is the most important among them.

(b) How storing all this vital cultural heritage stuff at one single site is a terrible idea. Today, the Internet Archive might be taken down by hackers. Tomorrow, the site might commit suicide by lawyers. And in a possible future, a fascist US government might take the site down out of sheer spite.

While there are a fair number of other, more specialized digital libraries out there, too many public domain works are only available at the Internet Archive. And another huge percentage is stored only at the Internet Archive and Google Books, which is not a lot better.

We need a more distributed archive system where all these works can be stored on multiple servers around the world - yet where users can search through all of them with comparable ease. Only in this way will our digital cultural heritage be truly safe.

Perhaps a #Fediverse-based approach could work? Something like #Bookwyrm, but with actual data storage?

What do you think?

@juergen_hubert@thefolklore.cafe
There’s a concept called decentralized storage.
https://www.techtarget.com/searchstorage/tip/Comparing-4-decentralized-data-storage-offerings

@juergen_hubert@thefolklore.cafe Storing is easy; the problem you face is organization: you will need to find the books you are looking for. That requires a search index, and probably some kind of tags. OpenLibrary has thousands of volunteers for this, and most of their entries are still rather bare-bones. I think this requires a more institutionalized approach than BookWyrm (which draws data from institutions like OpenLibrary and Worldcat).
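
To make the "search index plus tags" point concrete, here is a minimal sketch of an inverted index over book records. It is only an illustration: the `BookIndex` class, the record ids and the sample tags are all made up for this example and have nothing to do with how OpenLibrary or BookWyrm actually store their data.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class BookIndex:
    """Hypothetical in-memory index: maps title words and tags to book ids."""

    def __init__(self):
        self.postings = defaultdict(set)  # token or "tag:<name>" -> book ids
        self.records = {}                 # book id -> metadata

    def add(self, book_id, title, tags=()):
        self.records[book_id] = {"title": title, "tags": list(tags)}
        for token in tokenize(title):
            self.postings[token].add(book_id)
        for tag in tags:
            # Tags get a prefix so they never collide with title words.
            self.postings["tag:" + tag.lower()].add(book_id)

    def search(self, query, tag=None):
        """Return sorted ids of books whose title contains every query word."""
        tokens = tokenize(query)
        if not tokens:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        if tag is not None:
            ids &= self.postings.get("tag:" + tag.lower(), set())
        return sorted(ids)


index = BookIndex()
index.add("b1", "Deutsche Sagen", tags=["folklore", "german"])
index.add("b2", "Sagen aus Westfalen", tags=["folklore"])
print(index.search("sagen"))
print(index.search("sagen", tag="german"))
```

The code is the easy part. Somebody still has to supply the titles and tags for every scan, which is the volunteer bottleneck described above.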

@belchion@rollenspiel.social Are there any good models for distributed search functions out there?

@juergen_hubert@thefolklore.cafe The only one I know is YaCy (https://yacy.net/), but I have never tried it.

Theoretically, you could build an engine with the Apache Lucene environment and base its crawler component on some kind of P2P network, but that requires a whole lot of specialized technical expertise and probably cannot be done without major support infrastructure.
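
For a rough idea of what the query side of such a system might look like, here is a toy sketch of federated search: the same query goes out to several independent nodes and the hits are merged. It does not use Lucene or YaCy at all. The `Node` class and `federated_search` function are hypothetical stand-ins, and the sketch assumes every node uses the same identifier for the same work, which is a large assumption in practice.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Stand-in for one archive server with its own small catalogue."""

    def __init__(self, name, catalogue):
        self.name = name
        self.catalogue = catalogue  # work id -> title

    def search(self, term):
        """Return the works on this node whose title contains the term."""
        term = term.lower()
        return {wid: title for wid, title in self.catalogue.items()
                if term in title.lower()}


def federated_search(nodes, term):
    """Query all nodes in parallel and merge hits by work id.

    Each merged entry lists the nodes holding a copy, so a user can
    still reach a work when one of the servers is down.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        results = list(pool.map(lambda n: (n.name, n.search(term)), nodes))
    merged = {}
    for name, hits in results:
        for wid, title in hits.items():
            entry = merged.setdefault(wid, {"title": title, "holders": []})
            entry["holders"].append(name)
    return merged


node_a = Node("a", {"w1": "Deutsche Sagen", "w2": "Irish Fairy Tales"})
node_b = Node("b", {"w1": "Deutsche Sagen"})
print(federated_search([node_a, node_b], "sagen"))
```

A real network would also need node discovery, timeouts, and a way to agree on identifiers, which is where the expertise and infrastructure mentioned above come in.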

@juergen_hubert Weren’t early music and film ‘sharing’ schemes based on distributed storage of files on individuals’ hard drives, via torrents or the like?

@juergen_hubert@thefolklore.cafe HathiTrust has a fair number of public domain books.

@petes_bread_eqn_xls@mastodo.neoliber.al True, but they also have geographic restrictions on many of them.

Fediverse stuff

!fediverse@fedia.io

This is a magazine dedicated to posts about the Fediverse and things related to it. This is an Mbin magazine, but you can follow it from Lemmy or Piefed as well. If you want to post specifically about Mbin, feel free to post in !mbinmeta@gehirneimer.de
