How will Lemmy scale?

posted 1 year ago

I already get rate-limited like crazy on Lemmy and there are only about 60,000 users on my instance. Is each instance really just one server, or are there multiple containers running across several hosts? I’m concerned that federation will mean an inconsistent user experience. Some instances may be beefy, others will be under-resourced… so the average person might think Lemmy overall is slow or error-prone.

Reddit has millions of users. How the hell is this going to scale? Does anyone have any information about Lemmy’s DB and architecture?

I found this post about Reddit’s DB from 2012. Not sure if Lemmy has a similar approach to ensure speed and reliability as the user base and traffic grow.

https://kevin.burke.dev/kevin/reddits-database-has-two-tables/

marsara9@lemmy.world

2 points

1 year ago

Lemmy is entirely open source, so you can see what their architecture looks like, etc… here: https://github.com/LemmyNet/lemmy.

Rate limits, as I understand them from the code, should only apply on a per-IP basis. So you should only be seeing rate limit errors if:

you’re behind CGNAT and multiple people who use your ISP are using Lemmy
you’re sending A LOT of requests to your instance yourself
the admin of your instance has significantly lowered the rate limits (viewable here: /api/v3/site)
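To check the third case yourself, a minimal sketch along these lines could pull the limits out of the /api/v3/site response. The JSON path `site_view.local_site_rate_limit` and the key names in `SAMPLE` are assumptions based on the 0.18-era API, and the numbers are made up for illustration; compare against what your own instance actually returns.

```python
import json
from urllib.request import urlopen

# Assumption: rate limits live under site_view.local_site_rate_limit.
RATE_LIMIT_PATH = ("site_view", "local_site_rate_limit")

# Illustrative payload only; the values are invented, not real defaults.
SAMPLE = {
    "site_view": {
        "local_site_rate_limit": {
            "id": 1,
            "local_site_id": 1,
            "message": 180,
            "message_per_second": 60,
            "post": 6,
            "post_per_second": 600,
        }
    }
}


def fetch_site(instance: str) -> dict:
    """Fetch and parse /api/v3/site from an instance such as 'lemmy.world'."""
    with urlopen(f"https://{instance}/api/v3/site", timeout=10) as resp:
        return json.load(resp)


def extract_rate_limits(site: dict) -> dict:
    """Return the integer rate-limit settings, skipping id columns."""
    node = site
    for key in RATE_LIMIT_PATH:
        if not isinstance(node, dict):
            return {}
        node = node.get(key, {})
    if not isinstance(node, dict):
        return {}
    return {
        k: v for k, v in node.items()
        if isinstance(v, int) and not k.endswith("id")
    }


if __name__ == "__main__":
    # Offline demo; swap SAMPLE for fetch_site("your.instance") to query live.
    for name, value in extract_rate_limits(SAMPLE).items():
        print(f"{name}: {value}")
```

The extraction is kept separate from the network call so it can be tried against a saved response without hitting the server (and without spending any of your rate limit).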

radix@lemm.ee

1 point

1 year ago

I’m not an expert, but I thought the issue was generally that big instances like lemmy.world were getting overloaded on the server side, not that they were enforcing a manufactured rate limit on individual IPs.

Also, someone else mentioned that on the fediverse even simple things like an upvote are slower and require more work here than in centralized platforms because they must be sent to all the instances that are indexing that user/community. As I understand it, that’s inherent to the fediverse, a feature not a bug, designed for redundancy and resilience.

Again uninformed, but Lemmy seems like it should scale fine. Bigger instances will monetize, driving prospective users to smaller instances, and then rate limiting and server lag won’t be so bad anymore.

marsara9@lemmy.world

1 point

1 year ago

> I’m not an expert, but I thought the issue was generally that big instances like lemmy.world were getting overloaded on the server side, not that they were enforcing a manufactured rate limit on individual IPs.

From what I can see it’s both. lemmy.world and others are getting overloaded, but there is also an inherent, built-in rate limit in the code itself. You can see what those limits are via /api/v3/site. In theory, if you’re actually getting rate-limited you should be seeing HTTP 429 responses from the server. If the server is just overloaded, you’ll get a 5xx response, the request will simply time out, or at best you’ll still get a response but after a significant delay (which is what most people are seeing).
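As a rough sketch of that distinction, the symptoms could be sorted like this. The function name, the labels, and the 5-second "slow" threshold are all assumptions chosen for illustration, not anything taken from Lemmy itself.

```python
from typing import Optional


def classify(status: Optional[int], elapsed_s: float,
             timed_out: bool = False, slow_after_s: float = 5.0) -> str:
    """Guess whether a request hit a rate limit or an overloaded server.

    status       HTTP status code, or None if no response arrived
    elapsed_s    how long the request took, in seconds
    timed_out    True if the client gave up waiting
    slow_after_s hypothetical cut-off for "significant delay"
    """
    if timed_out or status is None:
        return "overloaded (timeout)"
    if status == 429:
        # 429 Too Many Requests is the standard rate-limit status.
        return "rate-limited"
    if 500 <= status <= 599:
        return "overloaded (server error)"
    if elapsed_s > slow_after_s:
        return "overloaded (slow response)"
    return "ok"


if __name__ == "__main__":
    print(classify(429, 0.2))
    print(classify(502, 1.0))
    print(classify(200, 12.0))
```

Only the first case points at the per-IP limit; the other three all look like a struggling server.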

> Also, someone else mentioned that on the fediverse even simple things like an upvote are slower and require more work here than in centralized platforms because they must be sent to all the instances that are indexing that user/community. As I understand it, that’s inherent to the fediverse, a feature not a bug, designed for redundancy and resilience.

I don’t want to comment on this too much as I’m not an expert here, but here’s how federation / ActivityPub works from what I understand looking at the code:

Whenever you take any action (or activity), your browser first sends that message to your instance. If your instance owns the community, the message is then propagated out to EVERY linked instance listed at /instances or /api/v3/federated_instances. If your instance doesn’t own the community, the message is forwarded to the instance that does, and they send it out to EVERYONE on their federated instances list. As you can see, this creates A LOT of network traffic.
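The flow described above can be sketched as a list of server-to-server hops. This is only a model of the description in this comment, not Lemmy's actual delivery code; the function name and the example hostnames are hypothetical.

```python
from typing import List, Tuple

Hop = Tuple[str, str]  # (sending instance, receiving instance)


def delivery_hops(user_instance: str, community_instance: str,
                  federated: List[str]) -> List[Hop]:
    """List the hops one activity takes under the model described above.

    federated is the list of instances linked to the community's home
    instance (what /api/v3/federated_instances would show there).
    """
    hops: List[Hop] = []
    if user_instance != community_instance:
        # The user's instance forwards the activity to the community owner.
        hops.append((user_instance, community_instance))
    for peer in federated:
        if peer != community_instance:
            # The owner fans the activity out to every linked instance.
            hops.append((community_instance, peer))
    return hops


if __name__ == "__main__":
    peers = ["b.example", "c.example", "d.example"]
    for sender, receiver in delivery_hops("b.example", "a.example", peers):
        print(f"{sender} -> {receiver}")
```

Under this model a single upvote from a remote user costs one forward plus one delivery per linked instance, which is where the traffic comes from.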

This poses an interesting problem: the number of ActivityPub messages goes up as the number of instances increases. But at the same time, as more and more users join a single instance, that instance has to send more and more traffic to individual users’ browsers as they view and respond to posts. So the problem here is finding a good balance. And to top it off, most users will default to joining the largest instances, so those instances incur more and more traffic just to serve content.

> Again uninformed, but Lemmy seems like it should scale fine. Bigger instances will monetize, driving prospective users to smaller instances, and then rate limiting and server lag won’t be so bad anymore.

Will it though? How would an individual instance monetize? They would have to use donations. If an instance tries to add ads, users will leave for an instance that doesn’t, so it gets no income. They could charge a subscription fee, but again users would just leave and the admins would get nothing.

The ideal configuration of the fediverse as I see it, is if we had two types of servers 1) content servers that only hosted communities but didn’t have any real number of users, and 2) user servers that have no communities but most of the users. This way the number of API requests between instances is rather limited. When you end up with a server that has both most of the content and the userbase, the workload of that server appears to grow exponentially instead of linearly as the number of new instances rises.

iso@lemmy.com.tr

1 point

1 year ago

Since Mastodon uses ActivityPub too, how did it cope after Elon?

Hexadecimalkink@lemmy.ml

3 points

1 year ago

Mastodon is a couple of years older than Lemmy; it was already a fairly mature product. People think Lemmy is badly coded, but we’re all using a beta product with no user limits.

Iteria@sh.itjust.works

7 points

1 year ago

Education, probably. Back in the day people didn’t have any problem understanding that different forums had different capabilities. When MMOs were in full swing, people didn’t have problems understanding what being on the popular server during peak hours meant.

Everyone has just gotten too used to centralization with a lot of money behind it. Eventually people will adjust their expectations. Even if Meta’s fediverse attempt takes off, there are always going to be niche communities that exist outside of those spheres, so if people want that, they’ll have to move.

The point of the fediverse is having a choice. Some people are going to choose the megacorp of the week’s offering, and that’s okay as long as little pockets exist for when people get mad at the megacorp. Also, federation leaves space for multiple dominant platforms in a way the current system doesn’t.

In short, eventually some instances are going to be bankrolled either through a robust crowdsourcing effort or through being a company. That’s okay. The purpose of the fediverse is to allow smaller niche ideas to breathe without having to adhere to one group’s ideals. “If you don’t like it, make your own” is a fair statement now.

SlovenianSocket@lemmy.ca

11 points

1 year ago

As far as I’m aware, Lemmy does not support load balancing or high availability as it currently stands. But development is still in its infancy and I’m sure that’s a top priority.

Max-P@lemmy.max-p.me

1 point

1 year ago

I don’t see any reason why you couldn’t run more copies of the backend and frontend, as long as it uses the database properly. It should scale horizontally decently for a while.

At work I have clusters that run 40-50 application servers, all going to one database, and handle millions of requests daily on a pretty inefficient PHP application. Lemmy being in Rust, it can handle a lot of traffic.

Given the frontend is in Node.js, I suspect we’ll need to scale up the frontend first, which should be no problem at all: just many copies of the frontend to fewer copies of the backend to fewer copies of the database. Maybe slap Cloudflare in front at some point.

It will probably get costly to run before it becomes hard technically to scale up.

notavote@sh.itjust.works

1 point

1 year ago

Additionally, federation messages can probably be split off onto a separate server quite easily, and federation is being made much more efficient right now.

Freeman@lemmy.pub

3 points

1 year ago

I mean, it can… it’s just very DB heavy. It would be on an admin to scale up and scale out a single instance with multiple DBs, replication, etc.

It would be nice to be able to assign DBs to a task (i.e. one for federation updates, one for local community posts, one to service web requests). There may be a way to do that already but I’m not aware of one; it may need to be in code.

Also, syncing/federation across instances seems to be a mixed bag. My instance will sometimes waste threads trying to sync with instances that have come and gone. As a result, some communities I’d love to see updates on don’t come through.

Ideally they figure out a way to continue to optimize federation and allow smaller instances to just pick up the load.

Mine is open, but I’m not getting any registration requests. I’m not upset about it, but their main join page still seems to optimize for larger instances. It would make more sense to optimize for smaller ones to better distribute load, and to focus dev work on better/smoother syncing between federated instances.

Some locking down is a concern. I would love to see a lemmy of trust group if that came to pass. Where you can join the group and federate. My biggest concern with open federation is the legal risk of things like CSAM or CP getting synced onto your instances even if you have the nsfw box unchecked.

AggressivelyPassive@feddit.de

1 point

1 year ago

It’s not only about scaling a single instance, but scaling the fediverse.

Currently, each instance sends all events to all federated instances. That means each instance essentially needs to store and process a significant part of the entire fediverse. That’s insane and has to be addressed.
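A back-of-the-envelope sketch of why that hurts, assuming the simplest possible model: every instance produces the same number of events and delivers each one to every other instance. Real federation only targets instances with subscribers, so treat this as an upper bound under those assumptions; the function name is hypothetical.

```python
def total_deliveries(num_instances: int, events_per_instance: int) -> int:
    """Server-to-server deliveries if every event goes to every other instance."""
    if num_instances < 1 or events_per_instance < 0:
        raise ValueError("need at least one instance and non-negative events")
    return num_instances * events_per_instance * (num_instances - 1)


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, "instances:", total_deliveries(n, 100), "deliveries")
```

In this model, doubling the number of instances roughly quadruples the total number of deliveries, so the network-wide load grows with the square of the instance count while each instance's own inbound load grows with the total event volume.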

ollie@lemmy.world

2 points

1 year ago

I don’t think it officially supports it, but it does work! Lemmy.world is currently running on multiple containers load balanced by nginx. Look at u/ruud’s latest post about it.
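For anyone unfamiliar with what "load balanced" means here: nginx's default strategy is round-robin, handing each new request to the next backend in turn. A toy sketch of that idea follows; the class name and the container addresses are hypothetical and say nothing about how lemmy.world is actually configured.

```python
from itertools import cycle
from typing import Iterable


class RoundRobin:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends: Iterable[str]) -> None:
        pool = list(backends)
        if not pool:
            raise ValueError("need at least one backend")
        self._rotation = cycle(pool)

    def pick(self) -> str:
        """Return the backend that should serve the next request."""
        return next(self._rotation)


if __name__ == "__main__":
    # Hypothetical container names for illustration.
    lb = RoundRobin(["lemmy-1:8536", "lemmy-2:8536", "lemmy-3:8536"])
    for request_id in range(5):
        print(f"request {request_id} -> {lb.pick()}")
```

This only spreads the web/API load; every backend copy still talks to the same database, which is why the DB tends to become the bottleneck first.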

Asklemmy

!asklemmy@lemmy.ml
