Google Removed 749M Anna's Archive URLs from Its Search Results

Original link: https://torrentfreak.com/google-removed-749-million-annas-archive-urls-from-its-search-results/

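Anna’s Archive, a popular “shadow library” that offers free access to books and articles, has become a top target for copyright holders. In just three years, publishers have prompted Google to remove a staggering **749 million URLs** linking to the site, **5% of *all* copyright-related URL removal requests Google has ever processed.** Despite this unprecedented effort, Anna’s Archive remains easy to find through Google. The site launched after the crackdown on Z-Library, operates as a meta-search engine for pirated content, and even assists AI researchers. Publishers such as Penguin Random House and Wiley are driving the removal requests, reporting roughly 10 million new infringing URLs per week. Although Google removes the reported links, the sheer volume of content and the site’s use of multiple domains make complete suppression impossible. URLs are delisted and demoted, yet a simple search for “Anna’s Archive” still leads users straight to the site, highlighting the ongoing challenge of fighting online piracy.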

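A recent Hacker News thread discussed Google’s removal of 749 million URLs for Anna’s Archive (a site that archives books and other content, typically obtained via torrent). Users speculated that Google removed the links after using the archive’s content to train its Gemini AI models, effectively erasing its sources. The discussion then turned to how search is evolving. Some questioned Google’s continued relevance, arguing that chatbots offer a more convenient but potentially less transparent search experience. Commenters worried that chatbots may rely on potentially “junk” sites and that users lose autonomy in verifying sources. Further debate focused on *whether* chatbots have their own search indexes, and on the limits of full-text search versus metadata indexing for both Google and chatbots, particularly for Anna’s Archive, whose content mainly links to downloadable torrents rather than hosting full text.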

Original article

Popular shadow library Anna's Archive has become a top target for copyright holders. In just three years, publishers and authors have prompted Google to remove 749 million of the site's URLs from its search results. Despite this immense takedown campaign, which accounts for 5% of all URLs reported to Google on copyright grounds, the site itself remains easily discoverable through the search engine.

Anna’s Archive is a meta-search engine for shadow libraries that allows users to find pirated books and other related sources.

The site launched in the fall of 2022, just days after Z-Library was targeted in a U.S. criminal crackdown, to ensure continued availability of ‘free’ books and articles to the broader public.

In the three years since then, Anna’s Archive has built up quite the track record. The site has been blocked in various countries, was sued in the U.S. after it scraped WorldCat, and actively provides assistance to AI researchers who want to use its library for model training.

Despite legal pressure, Annas-archive.org and the related .li and .se domains remain operational. This is a thorn in the side of publishers who are actively trying to take the site down. In the absence of options to target the site directly, they ask third-party intermediaries such as Google to lend a hand.

749 Million URLs

Google and other major search engines allow rightsholders to request removal of allegedly infringing URLs. The aim is to ensure that pirate sites no longer show up in search results when people search for books, movies, music, or other copyrighted content.

The Pirate Bay, for example, has been a popular target; Google has removed more than 4.2 million thepiratebay.org URLs over the years in response to copyright holder complaints. While this sounds like a sizable number, it pales in comparison to the volume of takedowns targeting Anna’s Archive.

Google’s transparency report reveals that rightsholders asked Google to remove 784 million URLs, divided over the three main Anna’s Archive domains. A small number were rejected, mainly because Google didn’t index the reported links, resulting in 749 million confirmed removals.

The comparison to sites such as The Pirate Bay isn’t fair, as Anna’s Archive has many more pages in its archive and uses multiple country-specific subdomains. This means that there’s simply more content to take down. That said, in terms of takedown activity, the site’s three domain names clearly dwarf all pirate competition.

Top targeted domains (Google)


5% of All Google Takedowns, Ever

Since Google published its first transparency report in May 2012, rightsholders have flagged 15.1 billion allegedly infringing URLs. That’s a staggering number, and it is remarkable that 5% of that total targeted Anna’s Archive URLs.
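For readers who want to check the arithmetic, the short Python sketch below recomputes the shares from the figures quoted in this article (784 million reported, 749 million removed, 15.1 billion flagged overall). The variable and function names are illustrative only, and because the article's figures are rounded, the results are approximate.

```python
# Figures as quoted in the article; all are rounded, so results are approximate.
REPORTED_URLS = 784_000_000        # Anna's Archive URLs rightsholders asked Google to remove
REMOVED_URLS = 749_000_000         # confirmed removals
ALL_FLAGGED_URLS = 15_100_000_000  # all URLs flagged to Google since May 2012


def share(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100 * part / total


# Share of all flagged URLs that were confirmed Anna's Archive removals.
removed_share = share(REMOVED_URLS, ALL_FLAGGED_URLS)

# Reported URLs that did not end up as confirmed removals.
not_removed = REPORTED_URLS - REMOVED_URLS

# Fraction of the reported Anna's Archive URLs that Google removed.
removal_rate = share(REMOVED_URLS, REPORTED_URLS)

print(f"Share of all flagged URLs: {removed_share:.2f}%")
print(f"Reported but not removed:  {not_removed:,}")
print(f"Removal rate:              {removal_rate:.1f}%")
```

Both the reported and the removed counts round to the same 5% of the overall total, so the headline figure holds whichever one is used.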

Penguin Random House and John Wiley & Sons are the most active publishers targeting the site, but they are certainly not alone. According to Google data, more than 1,000 authors or publishers have sent DMCA notices targeting Anna’s Archive domains.

Yet, there appears to be no end in sight. Rightsholders are reporting roughly 10 million new URLs per week for the popular piracy library, so there is no shortage of content to report.

With these DMCA takedown notices, publishers are aiming to make it as difficult as possible for people to find books on the site using Google. This works, as many URLs are now delisted while others are actively being demoted by the search engine for book-related queries.

That said, the Anna’s Archive website is certainly not unfindable. Searching for the site’s name in Google still shows the main domain as the top search result.

Search: Anna’s Archive
