The bot situation on the internet is worse than you could imagine

Original link: https://gladeart.com/blog/the-bot-situation-on-the-internet-is-actually-worse-than-you-could-imagine-heres-why

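This website is protected by the "Anubis" system, which is designed to stop AI companies from aggressively scraping its content. Scraping can cause downtime for all users, and Anubis aims to strike a balance between accessibility and protection.

Anubis works by requiring visitors to pay a small computational cost, similar to anti-spam techniques. For an individual user this cost is negligible, but for large-scale scraping operations it becomes significant and expensive.

For now, Anubis may occasionally affect legitimate users, but it is a temporary solution. The developers are working on more precise ways to identify and block automated "headless browsers" without affecting real users. Users of JavaScript-blocking plugins (such as JShelter) may need to disable them for the site to work properly.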

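## Bot Traffic Surge and Solutions

A recent Hacker News discussion highlighted a marked increase in malicious bot traffic that is significantly affecting websites. Users reported receiving hundreds of thousands of requests per day, often from China, that consume resources and mimic legitimate traffic, making filtering difficult.

One solution gaining attention is "Anubis", a simple proof-of-work system. Early data suggests a large reduction in bot activity: one user reported that after deploying it, requests dropped from hundreds of thousands to just 11 per day. Users also suggested pointing to an `llms.txt` file so that legitimate LLM crawlers can get access while malicious bots are blocked.

The conversation touched on potential long-term solutions, such as government digital identity for authenticating requests, and noted that telling bot-driven spam from human-driven spam is increasingly difficult, even for new account registrations. Concerns were also raised about the privacy implications of logging user activity for bot detection.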

Original text


You are seeing this because the administrator of this website has set up Anubis to protect the server against the scourge of AI companies aggressively scraping websites. This can and does cause downtime for the websites, which makes their resources inaccessible for everyone.

Anubis is a compromise. Anubis uses a Proof-of-Work scheme in the vein of Hashcash, a proposed proof-of-work scheme for reducing email spam. The idea is that at individual scales the additional load is ignorable, but at mass scraper levels it adds up and makes scraping much more expensive.
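To make the cost asymmetry concrete, here is a minimal Hashcash-style sketch in Python. It is an illustration under stated assumptions, not Anubis's actual implementation: the page does not describe the challenge format, hash function, or difficulty, so SHA-256, the `challenge:nonce` encoding, and the names `solve`, `verify`, and `leading_zero_bits` are all hypothetical choices made for this example.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # First non-zero byte: add its leading zero bits, then stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(challenge: str, nonce: int) -> bytes:
    # Assumed encoding for this sketch: "<challenge>:<nonce>" hashed with SHA-256.
    return hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force the smallest nonce that meets the difficulty."""
    for nonce in itertools.count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty_bits:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash is enough to check the submitted nonce."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty_bits


if __name__ == "__main__":
    challenge = "example-challenge"  # hypothetical per-visitor challenge string
    difficulty = 12                  # hypothetical difficulty, in leading zero bits
    nonce = solve(challenge, difficulty)
    print(nonce, verify(challenge, nonce, difficulty))
```

With a difficulty of `d` bits, the solver needs about `2**d` hash evaluations on average, while the verifier needs exactly one. A single visitor pays that cost once per challenge; a scraper fetching many pages behind separate challenges pays it every time, which is where the "adds up" effect comes from.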

Ultimately, this is a placeholder solution so that more time can be spent on fingerprinting and identifying headless browsers (EG: via how they do font rendering) so that the challenge proof of work page doesn't need to be presented to users that are much more likely to be legitimate.

Please note that Anubis requires the use of modern JavaScript features that plugins like JShelter will disable. Please disable JShelter or other such plugins for this domain.
