I just sent a DMCA takedown last week to remove my site. They’ve claimed to follow meta tags and robots.txt since 1998, but no, they had over 1,000,000 of my pages going back that far. They even had the robots.txt configured for them archived from 1998.
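For anyone wondering what "configured for them" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the crawler identifies itself as `ia_archiver` (the user agent historically associated with the Internet Archive; treat that as an assumption) and uses `example.com` as a placeholder. It only shows what the rules ask for, not whether any crawler actually obeys them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one archiving crawler, allow everyone else.
# "ia_archiver" is assumed to be the archive crawler's user agent.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# example.com is a placeholder URL, not a real site from this thread.
archiver_allowed = parser.can_fetch("ia_archiver", "https://example.com/article")
other_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/article")

print("archiver allowed:", archiver_allowed)
print("other bot allowed:", other_allowed)
```

A well-behaved crawler would run a check like this before fetching; robots.txt is advisory, so nothing in the file itself enforces it.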
I’m tired of people linking to archived versions of things that I worked hard to create. Sites like Wikipedia were archiving URLs and then linking to the archive, effectively removing branding and blocking user engagement.
Not to mention that I’m losing advertising revenue if someone views the site in an archive. I have fewer problems with archiving if the original site is gone, but to mirror and republish active content with no supported way to prevent it short of legal action is ridiculous. Not to mention that I lose control over what’s done with that content – are they going to let Google train AI on it with their new partnership?
I’m not a fan. They could easily allow people to block archiving, but they choose not to. They offer a way to circumvent artist or owner control, and I’m surprised that they still exist.
So… That’s what I think is wrong with them.
From a security perspective it’s terrible that they were breached. But it is kind of ironic – maybe they can think of it as an archive of their passwords or something.
Not to mention that I’m losing advertising revenue if someone views the site in an archive.
No one is using Internet Archive to bypass ads. Anyone who would think of doing that already has ad blockers on.
You misunderstood. If they view the site at Internet Archive, our site loses out on the opportunity for ad revenue.
I completely understood. No one is going to IA as their first stop. They’re only going there if they want to see how a page has changed over time or if the original site is gone.
What do you mean by “engagement”, exactly? Clicking on ads?
In SEO terms, user engagement refers to how people interact with the website. Do they click on another link? Does a new blog posting interest them?
Lmao you think Google needs to go through Archive to scrape your site? Delusional.
Any activity from Google is easier to track, and I have a record of who downloaded content if it’s coming from my servers.
The mechanisms used to serve ads over the internet nowadays are nasty in a privacy sense, and a psychological manipulation sense. And you want people to be affected by them just to line your pockets? Are you also opposed to ad blockers by any chance?
I agree that many sites use advertising in a different way. I use it in the older internet sense – someone contacts me to sponsor a page or portion of the site, and that page gets a single banner, created in-house, with no tracking. I’ve been using the internet for 36 years. I’m well aware of many uses that I view as unethical, and I take great pains not to replicate them on my own site.
I disapprove of ad blockers. I approve of things that block tracking.
As far as “lining my own pockets” goes, I want to recoup my hosting costs. I spend hours researching for each article/showcase, make the content free to view, and then I’m expected to pay to share it with anyone who’s interested? I have a day job. This is my hobby, but it’s also my blood, sweat, and tears.
And how do you suggest a site which has been wiped off the face of the internet gets archived? Maybe we need to invest in a time machine for the Internet Archive?
archive.org could archive the content and only publish it if the page has been dark for a certain amount of time.
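A rough sketch of that rule, to make the idea concrete. Everything here is hypothetical: the function name, the 180-day grace period, and the idea that the archive records when it last saw the original page respond. It is not how archive.org works today.

```python
from datetime import datetime, timedelta

# Hypothetical grace period: how long the original must be dark
# before an archived copy may be shown publicly.
GRACE_PERIOD = timedelta(days=180)


def may_publish_snapshot(last_seen_live, now, grace_period=GRACE_PERIOD):
    """Return True only if the original page has been unreachable
    for at least `grace_period`. Snapshots are still captured either
    way; this only gates whether they are republished."""
    return now - last_seen_live >= grace_period


now = datetime(2024, 10, 15)

# Site responded last week: keep the snapshot private.
still_live = may_publish_snapshot(datetime(2024, 10, 8), now)

# Site has been dark for over a year: the snapshot can be published.
long_gone = may_publish_snapshot(datetime(2023, 6, 1), now)

print(still_live, long_gone)
```

The point of the design is that capture and publication are separate steps, so nothing is lost if a site disappears without warning.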
SEO killed the internet. You’re literally part of the reason why people go look for alternatives to viewing your website, no one wants ads.
Did you just draw comparison between redistribution of publicly available content and…rape? Dang.
how do you expect an archive to happen if they are not allowed to archive while it is still up. How are you supposed to track changes or see how the world has shifted? This is a very narrow and, in my opinion, selfish way to view the world.
how do you expect an archive to happen if they are not allowed to archive while it is still up.
I don’t want them publishing their archive while it’s up. If they archive but don’t republish while the site exists then there’s less damage.
I support the concept of archiving and screenshotting. I have my own linkwarden server set up and I use it all the time.
But I don’t republish anything that I archive because that dilutes the value of the original creator.
A couple of good examples are lifehacker.com and lifehack.org. Both sites used to have excellent content. The sites are still up and running, but the first one has turned into a collection of listicles and the second is an ad for an “AI-powered life coach”. All of that old content is gone and is only accessible through the Internet Archive.
In fact, many domains never shut down, they just change owners or change direction.
About the only thing I can agree with you on here is I don’t like when people on Wikipedia archive a link and then list that as the primary source in the reference instead of the original link. Wikipedia (at least in English) has a proper method to follow for citations with links and the archived version should only become the primary if the original source is dead or has changed and no longer covers the reference.
They should also honor a DMCA takedown and robots.txt, but at least with the DMCA I’m sure there’s a backlog. Personally I’ve always appreciated the archive’s existence, though, and would think their impact is small enough that it’s better to have them than block them.