Hey everyone,

This isn’t an announcement, I just wanted people’s thoughts on this.

I think everyone knows searching the fediverse can be better. Googling doesn’t work too well, etc. So I wanted to do my part and help out.

Indexing all posts, etc. is quite a lot to handle, so I wanted to start small and just focus on video search. I’ve started indexing videos from PeerTube and other video websites (even YouTube, but this could be removed to focus just on independent sites).

I know PeerTube has their own search engine for videos. I will be reaching out to them. Compared to theirs, I’m planning for my site to have other video sources and be easier to use.

So that leads to feedback from you guys.

  • What do you think about indexing videos posted on the fediverse and other independent platforms?
  • Are there similar services?
  • Am I just wasting my time?

I found FediSearch, and also this post basically saying that a fediverse search engine would just be used as a tool by trolls.

11 points

It’s worth noting that since FedSearch, Mastodon has actually natively implemented opt-in search on posts.

9 points

That’s a good point. But those people can be banned? I guess Reddit handles this by moderation and archiving old posts.

4 points

Yes, but moderation teams on the fediverse are very small, and by the nature of it, trolls can make hundreds of accounts on different servers, all trolling, that would each need to be individually sought out and banned.

It is a game of cat & 100 mice.

1 point

People will take the harassment off-site, especially if they are dedicated enough, or use it to scrape for potential personal info to publicly release.

12 points

How is that different from Reddit? If trolls want to search and scrape and find information on people, they’re going to. You can’t put your information on the open Internet and not appreciate there’s always a danger of that.

4 points

That post wasn’t claiming that a search engine would only be used by trolls; it was explaining that they shut down their project because a chunk of the fediverse thinks that and complains about any search engine project. Discoverability is one of the network’s biggest challenges, and a search engine could really help with that.


Yes, not only used by trolls, but it would be a tool that could be leveraged by trolls. And I think the fediverse makes it easier to establish instances for marginalized groups, but also has more admins that just don’t want trolls because nobody here is making $ off them like the corporate socials are. I think adding search that is going to try to vacuum up everyone’s posts in the fediverse and make them easily sortable/targetable without instance admins’ permission isn’t cool. If someone is running a general instance that covers nothing that a troll could latch onto and wants the instance catalogued and searchable, then that’s fine by me. I don’t think bots should be doing that to the fediverse as a whole without admin permission, though.

2 points

I don’t think an admin’s permission has anything to do with it. If you post publicly on the fediverse, your posts are public. You should have the option to opt out of any indexing (just like you do for the rest of the open web). But saying it’s OK for you to read this post if it happens to come across your feed, but you shouldn’t be allowed to find it via a search, is ridiculous. Users get to make the choice with each post whether it’s public or not, but they don’t get to control how people consume those public posts.

-11 points

>muh trolls

God, shut up please. Why do you have to ruin something amazing like searching the entire fediverse with meaningless arguments about muh trolls?

4 points

But they are correct. There are vulnerable groups of people who have a harassment risk against them. We share the fediverse with others; be mindful of that. Making a search engine or an archiver for Lemmy is such a good idea with how it functions! But for the wider fediverse… that’s just directly contradictory to its culture unless instances and users can opt in.

10 points

> There are vulnerable groups of people who have a harassment risk against them.

People that are at risk for what they write on the public internet should be protected and empowered by having better privacy tools, not by pretending that they can have a “safe space” on the public internet.

There is no such thing as privacy on the internet. The Fediverse makes it seem that it mitigates the surveillance problem by spreading the information around and not having it under the control of one single large entity, but the truth is that the Fediverse makes it actually easier for dedicated malicious actors to collect data and reach their targets.

2 points

Those vulnerable groups should have the tools to protect themselves, but that shouldn’t stop the rest of us from having a functional and discoverable system. The internet, and the fediverse specifically, have always been a semi-public space and searchability has been a part of that since the beginning.

-15 points

Unpopular opinion: overly sensitive people should not be allowed to use the internet. Why should everything revolve around their insecurities? Grow a thicker skin or stop using the internet.


Your post comes from a place of privilege; marginalized groups care about trolls very much.

1 point

Yes, anything you post on the internet can be indexed. If someone wants to post something in their little private garden, there are options for that too. The fediverse has potential to grow, and if we try to stop everything that could help it grow with “no, only trolls will use it”, after some point nobody except the people who complain will use it. Do you want the fediverse to be your own little echo chamber?

23 points

Well, please make sure it respects post privacy at least, but also realize that on the microblogging side of the fediverse, they may not take kindly to this prospect at all. People who start these kinds of projects are often harassed or at least receive passive hostility. Making it opt-in instead of opt-out in some capacity is best.

39 points

I disagree. Post privacy sure, but the internet is by definition public. Anything you put out there can be used for pretty much everything, the original rules of the internet apply. I’d be happy to see an easy opt out on the engine to remove yourself, but if everything is opt in it’ll never get off the ground.

Deleted by creator

2 points

Again, it’s not going to servers and scraping data; it would be sitting somewhere receiving public data that is pushed out. There’s no malicious getting around privacy settings. If it’s pushed out, then it’s fair game. I agree about post privacy, but again, ActivityPub already takes care of that.

3 points

That’s not how the fediverse functions, and approaching it that way is a problem waiting to happen. I’m stating so as a warning to be mindful of the culture of the way the fediverse itself functions. This is not Reddit; we share the fediverse with other software with different uses and features, and we need to be mindful of that, especially when building these kinds of tools. Making it opt-out not only places a burden on smaller instances but presents a potential harassment risk for instances with vulnerable people on other fediverse platforms. As well, it is contrary to the entire way specific other ActivityPub instances operate. The fediverse is like a city we share with others; if Lemmy is not mindful of that city’s culture, then people will promptly give them the boot.

I’m not saying user-by-user opt-in either, but instance by instance. Lemmy especially needs an archiving tool. There are already cultural clashes I see occurring with the rest of the fediverse. Posts like this about potential tools, when it seems like the creator doesn’t know the messy history behind previous projects like them in the fediverse, make me fearful of those clashes coming to fruition.

13 points

Well that’s why I’m asking for input. And I won’t launch this on every instance without letting them know. Baby steps.

6 points

But ActivityPub already publishes all of the data out. I don’t think this is going out to servers asking for data; it’s listening to public data being broadcast. If people are broadcasting over ActivityPub, then they’re okay with it being shared.

If they don’t want it shared then they don’t have to publish ActivityPub to anyone. They can defederate from the search federation. Those tools already exist.
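
To make the “public data” point concrete, here is a minimal sketch (not anything OP has described building) of how an indexer could keep only publicly addressed activities. ActivityPub marks public posts by addressing them to the special Public collection in `to` or `cc`. The function name `is_publicly_addressed` and the sample activities are hypothetical, and a real indexer would also want to honor whatever opt-out signals instances and users provide.

```python
# Identifiers for the ActivityStreams Public collection. The full URI is the
# canonical form; the two short forms can appear after JSON-LD compaction.
AS_PUBLIC = {
    "https://www.w3.org/ns/activitystreams#Public",
    "as:Public",
    "Public",
}


def _as_list(value):
    """Normalize an addressing field that may be missing, a string, or a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def is_publicly_addressed(activity: dict) -> bool:
    """Return True if the activity lists the Public collection in `to` or `cc`."""
    recipients = _as_list(activity.get("to")) + _as_list(activity.get("cc"))
    return any(recipient in AS_PUBLIC for recipient in recipients)


# Hypothetical sample activities, for illustration only.
public_post = {
    "type": "Create",
    "to": ["https://www.w3.org/ns/activitystreams#Public"],
    "cc": ["https://example.social/users/alice/followers"],
}
followers_only_post = {
    "type": "Create",
    "to": ["https://example.social/users/alice/followers"],
}

indexable = [a for a in (public_post, followers_only_post) if is_publicly_addressed(a)]
```

Here only `public_post` ends up in `indexable`; the followers-only post is skipped because it never names the Public collection.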

2 points

> That’s not how the fediverse functions

That is how the fediverse functions. Instances send posts to anyone who requests them, unless a block is in place. ActivityPub is opt-out, and the web has always worked this way.

> be mindful of the culture

There is no “the culture” on the fediverse. You’re talking about a subgroup, which has a different opinion from other subgroups. They don’t get to define “culture” on the fediverse.


As the fediverse is almost exclusively run by volunteers who are paying server bills and being admins, I could see some larger instances not taking kindly to this, especially depending on how much stress it would put on some already at-capacity servers.

18 points

Ideally, OP’s crawlers will just come from their own instance that other instance owners can defederate from if they want to opt out.

0 points

How much bandwidth do you suppose a crawler would use? I’d guess very little.

20 points

A good search engine would be quite important. One thing that annoyed me back on the site that should not be named was that their search engine was completely useless. It could not even find posts when I entered verbatim text from them.

Having a good search engine that can actually find a post I was looking for would be a major plus for the fediverse.

4 points

Yep, the idea is to simulate the type of results you get from Google. People trust Lemmy answers more than spam sites nowadays.

18 points

Why wouldn’t people want to have a search engine? Without it, the Fediverse stands no chance against the non-free internet. Everything posted here would be much more valuable if it was searchable. Right now a comment is only viewed until its post gets less popular. Any other site of this kind displays answers decades old. Privacy isn’t an issue, as everything posted here is available to everyone on the internet.

4 points

I mean they are posting on the public internet, they should know that it can be read by anyone. I like the idea of users opting out.

12 points

Deleted by creator

8 points

Good point. They should know they are making public comments. If you want it private then send a private message.

11 points

Is this something you can point YaCy at?

2 points

I heard it’s not optimized well but I’ll take a look at it.

1 point

I’d be interested to know what “not optimized” means.

1 point

I heard it will send out requests as fast as possible, essentially creating a DoS attack, but I could be wrong. Also, the UI could use some improvement.
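
For what it’s worth, the usual way to avoid hammering instances is a per-host delay between requests. A rough sketch of that idea, assuming a hypothetical `PoliteScheduler` class and an arbitrary 5 second default (this is not a description of how YaCy behaves, just an illustration of the technique):

```python
import time
from urllib.parse import urlparse


class PoliteScheduler:
    """Enforce a minimum delay between requests to the same host.

    Hypothetical helper for illustration; `clock` and `sleep` are injectable
    so the behavior can be checked without actually waiting.
    """

    def __init__(self, min_delay=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time of the next request

    def wait_for(self, url):
        """Block until `url`'s host may be contacted; return seconds waited."""
        host = urlparse(url).netloc
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        waited = max(0.0, ready_at - now)
        if waited:
            self._sleep(waited)
        # The request goes out at max(now, ready_at); schedule the next slot.
        self._next_allowed[host] = max(now, ready_at) + self.min_delay
        return waited
```

A crawler would call `wait_for(url)` right before each fetch, so hosts are throttled independently and a slow schedule for one instance does not hold up the others.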
