Just started self-hosting this instance. Nothing in the docs mentioned anything about storage considerations.
This is lemmy.world after 4 weeks:
58G pictrs
34G postgres
Considering this is going to be around a 5-user instance at most, I think I’ll be good for a while. Thanks!
I’m running 50 users right now, subbed to A LOT of communities, and seeing DB growth of about 100 MB per day.
That seems high when you extrapolate that to 10000 users, like a larger instance might have.
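For what it's worth, here is the back-of-the-envelope version of that extrapolation. It is only a rough sketch: it assumes growth scales linearly with user count, which probably overstates things, since users who subscribe to the same communities share the same federated posts. The function name is made up for illustration.

```python
def projected_growth_mb_per_day(observed_mb_per_day, observed_users, target_users):
    """Scale observed daily DB growth linearly with user count.

    Assumption: every user adds the same amount of data per day.
    Real growth likely flattens out as users overlap in subscriptions.
    """
    per_user_mb = observed_mb_per_day / observed_users
    return per_user_mb * target_users


# Figures from the comment above: 50 users, about 100 MB per day.
daily_mb = projected_growth_mb_per_day(100, 50, 10_000)
print(f"{daily_mb / 1000:.0f} GB/day, {daily_mb * 365 / 1_000_000:.1f} TB/year")
# prints: 20 GB/day, 7.3 TB/year
```

So under the naive linear assumption, a 10000-user instance would add roughly 20 GB per day to the database alone.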
Question, if you know: does a Lemmy instance have to be publicly accessible to work? Like, if I make an instance on my homelab, can the instance “fetch” content and serve it faster locally? Could I reply to a post and have others see it? Etc.
Now I wonder how viable it would be to support video hosting. The answer is almost certainly “God no!”
Feels like this will benefit from some sort of fuzzy deduplication in the pictrs storage. I bet there are a lot of similar pics in there. E.g. if one pic or a gif is very similar to another, say just different quality or size, or compression, it should keep only one copy. It might already do this for the same files uploaded by different people as those can be compared trivially via hashing, but I doubt it does similarity based deduplication.
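To make the difference between the two kinds of deduplication concrete, here is a minimal sketch. It says nothing about what pictrs actually does. Exact duplicates are caught with a SHA-256 of the file bytes; similar images are caught with a simple "average hash" compared by Hamming distance. To stay dependency-free, it assumes the image has already been decoded and downscaled to a tiny grayscale grid; a real version would need an image library for that step. All function names and the distance threshold are illustrative.

```python
import hashlib


def exact_key(data: bytes) -> str:
    """Identical files always produce the same key; any byte change breaks it."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels):
    """Perceptual hash of a small grayscale grid (values 0-255).

    Each pixel contributes one bit: 1 if brighter than the grid's mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a, b, max_distance=2):
    """Treat two grids as the same picture if their hashes differ in few bits."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# A 4x4 "image", a slightly brighter re-encode of it, and its inverse.
original = [[10, 20, 200, 210], [15, 25, 205, 215],
            [10, 20, 200, 210], [15, 25, 205, 215]]
recompressed = [[p + 3 for p in row] for row in original]
different = [[255 - p for p in row] for row in original]

print(is_near_duplicate(original, recompressed))  # True
print(is_near_duplicate(original, different))     # False
```

The catch with the fuzzy approach is false positives: two genuinely different pictures can land within the threshold, so you would have to decide which copy to keep and accept that some merges will be wrong.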
Depends. If you have a lot of users posting a lot of pictures and you use pictrs out of the box config, then a lot. If you are just running a few users with finite communities being synced then a lot less. The number is going to vary a lot as lemmy grows and gets older so hard to document realistic expectations. But docker images are probably going to take up more disk space than actual contents unless you get quite big. I just threw my PG volume into a tgz to move servers and it’s less than a gig.
476M ./postgres
1.1G ./pictrs
After 3 weeks
This is my small instance with way fewer users than lemmy.world.
11G pictrs
5.2G postgres
Out of curiosity, how long has your instance been up? Just want to get a sense of how fast storage is increasing for you.
How has your Lemmy experience been on a self-hosted instance? I’m currently using lemmy.world and it’s very error-prone. Would self-hosting reduce those errors, and at the expense of what? Does federation take long, or do you find you’re getting federated content quickly enough?
The experience has been pretty good, to be honest. No instability, easy updates, etc. I find federated content quite quickly, because I use this script to populate the “All” feed.
Is there any way to purge old data?
I really hope it doesn’t get purged if lemmy is to be a Reddit replacement. A lot of the value Reddit had was obscure knowledge and making google searches actually usable.
I think as long as the original community the post is in doesn’t purge the data, it’s fine for other instances to purge if necessary.
Exactly. When dealing with big data, you need a strategy to archive old data. You can’t just store everything in one DB. Smaller instances may not feel like keeping all the data from all time. Even big instances should have a mechanism to move old data to different databases.
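A rough sketch of what "move old data elsewhere" could look like. Big caveats: this uses an in-memory SQLite database just so it runs anywhere, while Lemmy itself uses Postgres, and the `post` / `post_archive` tables and their columns are made up for the example, not Lemmy's real schema.

```python
import sqlite3


def archive_old_posts(conn, cutoff):
    """Move rows published before `cutoff` (ISO date string) into post_archive.

    The copy and the delete run in one transaction, so a failure
    part-way through leaves both tables unchanged.
    """
    with conn:
        conn.execute(
            "INSERT INTO post_archive SELECT * FROM post WHERE published < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM post WHERE published < ?", (cutoff,))
    return cur.rowcount


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, published TEXT, body TEXT)")
conn.execute("CREATE TABLE post_archive (id INTEGER PRIMARY KEY, published TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO post VALUES (?, ?, ?)",
    [(1, "2022-01-10", "old post"), (2, "2023-06-20", "recent post")],
)
conn.commit()

moved = archive_old_posts(conn, "2023-01-01")
print(moved)  # 1
```

In a real setup the archive would more likely live in a separate database or on cheaper storage, and anything that links to an archived post would need a way to find it again.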