Just started self-hosting this instance. Nothing in the docs mentioned storage considerations.
Holding onto all that data is pointless if you’re not selling it to someone.
I disagree. A big chunk of the value of a place like this is being able to look back at old threads. How many times did people say they always put “Reddit” in front of their Google searches to get the information they were looking for? This could be the same.
That’s unsustainable. Why do you think the mainstream platforms are selling out?
Info is still useful for people doing Google searches. It would be nice to be able to find common troubleshooting tips on Lemmy, etc.
Is there any way to purge old data?
I really hope it doesn’t get purged if Lemmy is to be a Reddit replacement. A lot of the value Reddit had was obscure knowledge and making Google searches actually usable.
I think as long as the original community the post is in doesn’t purge the data, it’s fine for other instances to purge if necessary.
Exactly, when dealing with big data, you need a strategy to archive old data. You can’t just store everything in one DB. Smaller instances may not feel like keeping all the data from all time. Even big instances should have a mechanism to move old data to different databases.
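To make the "move old data to a different database" idea concrete, here is a minimal sketch, not anything Lemmy actually ships. It uses SQLite in memory purely so it runs anywhere (a real instance uses Postgres), and the `post` table, its columns, and the one-year cutoff are all hypothetical.

```python
# Sketch of age-based archiving: move rows older than a cutoff from a
# "hot" database into a separate archive database.
# Table and column names are hypothetical, not Lemmy's actual schema.
import sqlite3
from datetime import datetime, timedelta, timezone


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a toy post table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE post (id INTEGER PRIMARY KEY, published TEXT, body TEXT)"
    )
    return db


def archive_old_posts(hot, archive, cutoff_iso: str) -> int:
    """Copy posts published before the cutoff into archive, then delete them from hot."""
    rows = hot.execute(
        "SELECT id, published, body FROM post WHERE published < ?", (cutoff_iso,)
    ).fetchall()
    with archive:  # commit the copy first, so a failure never loses data
        archive.executemany("INSERT OR IGNORE INTO post VALUES (?, ?, ?)", rows)
    with hot:
        hot.execute("DELETE FROM post WHERE published < ?", (cutoff_iso,))
    return len(rows)


hot, archive = make_db(), make_db()
now = datetime(2023, 7, 1, tzinfo=timezone.utc)
with hot:
    hot.executemany(
        "INSERT INTO post VALUES (?, ?, ?)",
        [
            (1, (now - timedelta(days=400)).isoformat(), "old thread"),
            (2, (now - timedelta(days=3)).isoformat(), "recent thread"),
        ],
    )

# ISO timestamps with the same UTC offset sort correctly as plain text.
cutoff = (now - timedelta(days=365)).isoformat()
moved = archive_old_posts(hot, archive, cutoff)
hot_count = hot.execute("SELECT COUNT(*) FROM post").fetchone()[0]
archive_count = archive.execute("SELECT COUNT(*) FROM post").fetchone()[0]
print(moved, hot_count, archive_count)
```

Copying before deleting means the worst case after a crash is a duplicate in the archive rather than a lost thread.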
This is lemmy.world after 4 weeks:
58G pictrs
34G postgres
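For a rough sense of scale, here is the average those figures imply. It assumes "G" means gigabytes and that growth was linear over the four weeks, which it probably wasn't.

```python
# Rough average growth from the lemmy.world figures quoted above.
# Assumes linear growth over the period, which is a simplification.
PICTRS_GB = 58
POSTGRES_GB = 34
WEEKS = 4


def average_growth(total_gb: float, weeks: int) -> tuple[float, float]:
    """Return (GB per week, GB per day) averaged over the period."""
    per_week = total_gb / weeks
    return per_week, per_week / 7


total_gb = PICTRS_GB + POSTGRES_GB
per_week, per_day = average_growth(total_gb, WEEKS)
print(f"total: {total_gb} GB, about {per_week:.1f} GB/week or {per_day:.1f} GB/day")
```

That comes to 92 GB total, or about 23 GB a week.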
Feels like this will benefit from some sort of fuzzy deduplication in the pictrs storage. I bet there are a lot of similar pics in there. E.g. if one pic or a gif is very similar to another, say just different quality or size, or compression, it should keep only one copy. It might already do this for the same files uploaded by different people as those can be compared trivially via hashing, but I doubt it does similarity based deduplication.
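I don't know what pict-rs does internally, so this is not its implementation, just a sketch of what similarity-based matching could look like using an average hash (one common perceptual hash). To keep it dependency-free, images here are plain 2D lists of grayscale values; decoding real files is left out, and the distance threshold of 5 is an arbitrary assumption.

```python
# Sketch of similarity-based deduplication with an average hash (aHash).
# Images are 2D lists of grayscale values (0-255). The threshold below is
# an illustrative guess, not a tuned value.

Gray = list[list[int]]


def shrink(img: Gray, size: int = 8) -> list[list[float]]:
    """Downscale to size x size by averaging rectangular blocks of pixels."""
    h, w = len(img), len(img[0])
    out = []
    for by in range(size):
        y0 = by * h // size
        y1 = max((by + 1) * h // size, y0 + 1)
        row = []
        for bx in range(size):
            x0 = bx * w // size
            x1 = max((bx + 1) * w // size, x0 + 1)
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """One bit per cell of the shrunken image: 1 if brighter than the mean."""
    flat = [v for row in shrink(img, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# A 16x16 gradient, the same picture at 2x size and slightly brighter,
# and a mirrored gradient that should not match.
original = [[x * 16 for x in range(16)] for _ in range(16)]
resized = [[min(255, (x // 2) * 16 + 3) for x in range(32)] for _ in range(32)]
different = [[240 - x * 16 for x in range(16)] for _ in range(16)]

print(is_near_duplicate(original, resized))
print(is_near_duplicate(original, different))
```

The resized copy hashes the same as the original, while the mirrored image differs in every bit. Even so, near matches on real photos can be false positives, so silently dropping a "duplicate" upload is riskier than exact-hash deduplication.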
Considering this is going to be around a 5 user instance at most I think I’ll be good for a while. Thanks!
I’m running 50 users right now, subbed to A LOT of communities, seeing DB growth of about 100 MB per day.
That seems high when you extrapolate that to 10000 users, like a larger instance might have.
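Spelling that extrapolation out, with the big caveat that it assumes growth scales linearly with user count. It likely doesn't, since much of the database is presumably federated content from subscribed communities, which is stored once no matter how many local users read it.

```python
# Naive linear extrapolation of database growth by user count.
# Assumption: growth is proportional to users. This probably overstates
# things, because federated content is shared by all local users.
OBSERVED_USERS = 50
OBSERVED_MB_PER_DAY = 100


def linear_growth_mb_per_day(users: int) -> float:
    """Scale the observed daily growth by user count."""
    per_user = OBSERVED_MB_PER_DAY / OBSERVED_USERS
    return per_user * users


mb_per_day = linear_growth_mb_per_day(10_000)
gb_per_day = mb_per_day / 1000
tb_per_year = mb_per_day * 365 / 1_000_000
print(f"{gb_per_day:.0f} GB/day, {tb_per_year:.1f} TB/year")
```

Taken at face value that is 20 GB a day, or roughly 7.3 TB a year, so treat it as an upper bound rather than a forecast.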
Question if you know: does a Lemmy instance have to be publicly accessible to work? Like, if I make an instance on my homelab can the instance “fetch” content and serve it faster locally? Could I reply to a post and have others see it? Etc.
Now I wonder how viable it would be to support video hosting. The answer is almost certainly “God no!”
I haven’t tried it out just yet, but I’d say… a whole lot. It depends on how popular your instance is, but… your PC will be hammered either way.
How many cans-of-beans.jpg can you store?