Most people I know use Google by typing in whatever question they have and adding the word “reddit” at the end to find Reddit threads, since Reddit currently has the most useful information.
As Lemmy fills up with useful threads and reviews, it would be great if we could collectively improve Lemmy’s SEO so that just including the word “lemmy” in a search shows Lemmy threads related to the query.
The obscure TLDs used by Lemmy servers don’t help, and lemmy.com currently redirects to lemm.ee. Is there a way we can improve the SEO of all instances, or have lemmy.com be an aggregator of threads from many Lemmy servers?
This is going to present a formidable challenge because Lemmy is so decentralized. I mean I am all for it but wouldn’t even begin to conceptualize how to get started.
Not really… Nearly every phpBB, SMF, Invision forum, etc. is a separate self-hosted community, and they get indexed. Hell, even small ones like one of my gaming guilds ranked high for specific keywords.
For SEO to work well you need clear links to clear topics, and it also helps to have automatically generated metadata to assist indexing.
Having Google-compatible sitemaps assists in crawling the site.
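To make the sitemap point concrete, here is a minimal sketch of how an instance could emit one. It follows the public sitemaps.org format; the function name `build_sitemap` and the example URLs are made up for illustration and are not part of Lemmy.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap XML string.

    `entries` is an iterable of (url, lastmod) pairs, where lastmod is a
    date string such as "2023-07-01" or None. Hypothetical helper, not a
    Lemmy API.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        if lastmod:
            ET.SubElement(node, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("https://lemmy.one/post/1", "2023-07-01"),
        ("https://lemmy.one/post/2", None),
    ]))
```

A real instance would generate the entries from its post table and would probably need to split the output into several files once the list gets large.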
They can all get crawled, but you can’t just add lemmy and force results from the fediverse to the top.
I think once there is enough info on lemmy it’ll just get to the point where the searches will bring up the information you need.
I feel like if I’m searching on Google I want it to take me to the most relevant source of information even if that still is reddit. It won’t always be. But it is still right now.
Nothing wrong with competition, it’ll give better search results.
I think to some extent this is starting to work. I googled a software question the other day and found a Lemmy answer within the first couple of results! Definitely better than a couple of months ago, when even searching for “lemmy” didn’t bring up Lemmy among the top few.
Keep in mind how Google works. It takes your metrics into consideration, so someone who has never Googled Lemmy likely wouldn’t have seen that same result.
Lemmy and other federated systems being spread out across individual instances does make things more difficult. Normally you could just do “site:reddit.com” and automatically filter all results to be from Reddit. But you can’t do that because your result could be on any one of hundreds of instances, many of which do not have “lemmy” in their title.
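One partial workaround is to chain several `site:` filters with `OR`. A rough sketch, assuming you maintain your own list of instance hostnames (the helper name `multi_site_query` is made up here):

```python
def multi_site_query(terms, instances):
    """Combine search terms with a site: filter for each instance host.

    `instances` is a list of hostnames you chose yourself; there is no
    built-in way to ask a search engine for "all Lemmy instances".
    """
    if not instances:
        return terms
    site_filter = " OR ".join(f"site:{host}" for host in instances)
    return f"{terms} ({site_filter})"


if __name__ == "__main__":
    print(multi_site_query("best mechanical keyboard", ["lemmy.one", "lemm.ee"]))
```

This only scales to a handful of instances, since search engines cap query length, so it does not really solve the problem of hundreds of servers.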
It appears that reasonably large instances like lemmy.one have been indexed; I got results using site:lemmy.one. And for those larger instances, Google should still be able to index federated content.
Maybe I’m misinterpreting what you’re saying, but lemmy.one has basically no content on it other than the 8 communities that @jonah has created or allowed there. The whole point of that server is to allow people to simply login and then participate in other instances from there.
That is all to say, lemmy.one would be one of the “smaller” instances from a standpoint of content to be indexed by Google.
The whole point of that server is to allow people to simply login and then participate in other instances from there.
In order for users on lemmy.one to interact with content on other instances, lemmy.one has to import and host that content. So, it has plenty of content on it, just most of it originated elsewhere. That remote content should be just as indexable as local content.
I think it would be best if each post had a canonical tag pointing at the originating server’s version of that post. The Lemmy UI generates a canonical tag now, but I’m not sure whether it just points to itself.
I didn’t quite realize that; I figured that when you were viewing another instance’s content, it was loading from that instance. I guess that means Lemmy content has loads of redundant copies across all instances.
I would love to Google search Lemmy instead of Reddit. I just don’t know how feasible it would be, given that Lemmy is decentralized.
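Roughly what that canonical idea could look like, as a sketch only: the field names `origin_url` and `id` and the function `canonical_link_tag` are assumptions made for illustration, not how the Lemmy UI actually does it.

```python
from html import escape


def canonical_link_tag(post, instance_host):
    """Return a <link rel="canonical"> tag for a post page.

    `post` is a dict with hypothetical keys: "id" (local numeric id) and
    "origin_url" (URL of the post on the server it originated from, or
    None/absent for local posts).
    """
    origin = post.get("origin_url")
    if origin:
        # Federated copy: point search engines at the original.
        href = origin
    else:
        # Local post: the page is its own canonical version.
        href = f"https://{instance_host}/post/{post['id']}"
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'


if __name__ == "__main__":
    remote = {"id": 42, "origin_url": "https://lemm.ee/post/7"}
    print(canonical_link_tag(remote, "lemmy.one"))
```

With something like this, the redundant copies on other instances would all tell the crawler which version to rank, instead of competing with each other.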
I could see Google integrating with the fediverse once it reaches critical mass. Using ActivityPub for indexing ought to be more efficient than the usual web crawling.
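As a rough idea of why that could be simpler than crawling: ActivityPub objects are already structured JSON, so an indexer could read the fields directly instead of scraping pages. A sketch, assuming an object that uses the standard ActivityStreams properties `id`, `name`, `content`, and `published`; the function `to_index_document` is hypothetical and says nothing about how Google or any real indexer works.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def to_index_document(obj):
    """Turn an ActivityStreams-style object (a dict) into a flat record."""
    extractor = _TextExtractor()
    extractor.feed(obj.get("content") or "")
    extractor.close()
    # Join the text nodes and collapse runs of whitespace.
    text = " ".join("".join(extractor.parts).split())
    return {
        "url": obj["id"],
        "title": obj.get("name") or "",
        "text": text,
        "published": obj.get("published"),
    }


if __name__ == "__main__":
    example = {
        "type": "Page",
        "id": "https://lemmy.one/post/1",
        "name": "Hello",
        "content": "<p>Some <b>bold</b> text</p>",
        "published": "2023-07-01T00:00:00Z",
    }
    print(to_index_document(example))
```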