As I was browsing Lemmy and the fediverse at large, this question kept popping into my head.

Since multimedia files have a much bigger footprint than raw text, I'm worried that, as time goes on, massive resources will be needed to keep up with all the data coming in.

I do wonder whether instances have gone the cloud route and just put everything in something like AWS S3. Or maybe they use self-hosted object storage with something like MinIO?

48 points

This will differ greatly from instance to instance. The people running lemmy.world have published some info on their infrastructure. My instance is running on a rather small VPS with 100GB storage, but I will have to rethink my solution rather soon as images and videos from my subbed communities [Edit: which are stored on outside sites] are eating around a gigabyte per day and I think this is likely to increase.

Edit: I want to clarify that I was partially wrong - Lemmy only locally caches content which is hosted on outside sites (e.g. imgur). It does not cache content that was directly uploaded to another Lemmy instance and just embeds the source media.
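To put the "100GB disk, roughly a gigabyte per day" numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes growth stays constant (the comment above expects it to increase), and the `used_gb` figure is made up purely for illustration; `days_until_full` is a hypothetical helper, not anything Lemmy provides.

```python
def days_until_full(capacity_gb: float, used_gb: float, growth_gb_per_day: float) -> float:
    """Return how many days remain before the disk fills at a constant growth rate."""
    if growth_gb_per_day <= 0:
        raise ValueError("growth_gb_per_day must be positive")
    # Clamp at zero so an already-full disk reports 0 days instead of a negative value.
    free_gb = max(capacity_gb - used_gb, 0.0)
    return free_gb / growth_gb_per_day


# 100 GB disk growing ~1 GB/day; used_gb=40 is an assumed figure, not from the thread.
print(days_until_full(100, 40, 1.0))  # 60.0
```

So even a half-empty disk of this size would only last a couple of months at that rate, which is why some kind of expiry or external storage comes up in the replies.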

6 points

Maybe have users use an outside image provider, like imgchest or gfycat or whatever?

6 points

This is the best idea in my opinion. Even on Reddit people used Imgur a bunch. It would be cool if either Lemmy itself or the various apps that have sprung up tied into an API for Imgur or similar to store pictures/videos; then it would be seamless to the end user.

EDIT: looks like Memmy on iOS actually already does this!

3 points

I was wrong about what gets cached: media that is hosted directly on remote instances is not cached, while media from outside sources (imgur etc.) is cached and served from that cache.

So, from a small instance’s point of view, the best case scenario would be if everyone used Lemmy’s own media hosting exclusively. But that would, of course, greatly increase the storage requirements of larger instances.

10 points

Gfycat is shutting down, sadly. There’s no money to be had in hosting pictures and videos for other sites that are viewed without ads. We already saw the Imgur clampdown a month or so ago. If these instances can’t self-host the content, it’s all going to have an invisible expiration date.

3 points

The issue is that if they shut down or change policies we get bit rot. Idk though it’s starting to sound pretty good.

14 points

Thank you for your work for the community! I think with more people using Lemmy, we as users should also look out for the infra we are using, because the admins are not a mega corporation ready to spin up infinite resources.

14 points

No need to thank me, currently I am the only non-bot-user of my instance and do not allow registrations 😅

Many of the bigger instances have links to donate to their operators, but I am doubtful that relying solely on donations will be enough in the long run.

2 points

Since you’re the only one, you might consider setting an expiration on the media so your local storage serves as more of a cache. Like, I’m sure you’re far more likely to revisit a recent thread than a super old one, and as long as the original instance is still around you could redownload the media. This might require software patches though idk
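A minimal sketch of what age-based expiry could look like, assuming the cached media is just a plain directory of files. That is an assumption: a real media server may track its files in its own database, so deleting files behind its back could break things. `expire_old_media` and its parameters are hypothetical names, not part of Lemmy or its media service.

```python
import os
import time
from pathlib import Path


def expire_old_media(cache_dir: str, max_age_days: float, dry_run: bool = True) -> list[Path]:
    """Find (and optionally delete) files last modified more than max_age_days ago."""
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    # Collect first, delete afterwards, so we never modify the tree while walking it.
    expired = sorted(
        path
        for path in Path(cache_dir).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
    if not dry_run:
        for path in expired:
            os.remove(path)
    return expired
```

With `dry_run=True` (the default) it only reports what would be removed, e.g. `expire_old_media("/path/to/cache", max_age_days=30)`, which seems like the safer way to try an idea like this out.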


Is it caching the entire image from the post or just the thumbnail?

4 points

Everything. It does some re-encoding when it retrieves content from other instances, and you can set limits in pict-rs (the software Lemmy uses to host media) regarding file sizes etc.

Edit: I was partially wrong about what is cached, see my original comment


I need to look more into pict-rs and what it can do. Is this done on purpose for image redundancy? I get the reasoning: if the original instance goes offline, I’d still have a copy. But maybe I don’t really want a copy? It would also be nice if I could get it to convert everything to WebP.

2 points

When I was looking into hosting my own instance I thought I saw an option to disable media file replication entirely so that they would always have to be fetched from their home instance.

3 points

Honestly you may benefit from a cloud system that can put old images into cold storage when they go unaccessed for more than a few days
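As a rough illustration of that tiering rule, here is a small sketch that decides whether an image belongs in hot or cold storage based on when it was last accessed. The 7-day threshold, the tier names, and `pick_tier` itself are assumptions for the example; managed object stores typically offer something like this as a built-in lifecycle or tiering policy, so you would likely configure it rather than write it yourself.

```python
from datetime import datetime, timedelta


def pick_tier(last_access: datetime, now: datetime, cold_after_days: int = 7) -> str:
    """Return "cold" if the object has gone unaccessed longer than cold_after_days, else "hot"."""
    if now - last_access > timedelta(days=cold_after_days):
        return "cold"
    return "hot"


# Dates below are made up for illustration.
now = datetime(2023, 7, 1)
print(pick_tier(datetime(2023, 6, 30), now))  # hot
print(pick_tier(datetime(2023, 6, 1), now))   # cold
```

The trade-off is that cold storage is usually cheaper to keep but slower or more expensive to read, so the threshold depends on how often old threads actually get revisited.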


It was my understanding that images stayed on the source server and only thumbnails were “federated”.

4 points

I was partially wrong about what is cached and updated my comment.


Thanks for the update. I was concerned. Tho I still need to figure out how to manage that storage.

