Anthropic's 'Too Dangerous To Release' AI Model Was Accessed By Discord Group On Day One

Original link: https://www.zerohedge.com/ai/anthropics-too-dangerous-release-ai-model-was-accessed-discord-group-day-one

Anthropic's powerful AI model "Mythos," capable of discovering and exploiting zero-day vulnerabilities, was breached within hours of its limited release in April 2026. Access had been granted through "Project Glasswing" to roughly 50 vetted organizations, including major tech companies such as Apple and Microsoft and even the U.S. National Security Agency, with the goal of proactively identifying security flaws. One group, however, obtained unauthorized access by combining a third-party contractor's credentials with clever "internet sleuthing" to deduce the API endpoint. They have reportedly used Mythos only for non-malicious purposes so far. The incident underscores how fragile the security around advanced AI can be, even when its creators are cautious. Anthropic is investigating, but the breach raises concerns that other restricted AI systems may be leaking through similar supply-chain vulnerabilities. And although the U.S. Department of Defense has labeled Anthropic a "supply-chain risk" over ethical disagreements, the U.S. government is actively using Mythos for vulnerability discovery, reflecting a complicated relationship with this powerful but potentially dangerous technology.


Original Article

Anthropic's 'Mythos' model is extraordinarily dangerous. The company itself warned that it could autonomously identify and exploit zero-day vulnerabilities in every major operating system, every major web browser, and every critical software library on Earth. And because of this offensive cybersecurity power, Anthropic refused to release Mythos publicly - and instead tightly restricted access through 'Project Glasswing' to roughly 50 carefully vetted organizations - 12 named launch partners plus more than 40 additional critical software and government entities, including the U.S. National Security Agency (NSA).

Yet within hours of the limited rollout announcement on April 7, 2026, a small group of unauthorized users in a private Discord server had already broken in.

The breach, reported by Bloomberg on Tuesday, reveals how fragile the safeguards around frontier AI models can be. According to the report, the group gained access using a surprisingly low-tech combination: legitimate credentials from a third-party contractor involved in Anthropic's evaluations, plus clever internet sleuthing to guess the hidden API endpoint by reverse-engineering Anthropic's internal naming conventions (patterns inferred from an earlier Mercor data leak).

They have reportedly been using Mythos regularly for nearly two weeks. Sources emphasize the usage has been non-malicious so far - things like building simple websites - rather than launching cyberattacks.

"We're investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments," an Anthropic spokesperson said in a statement, adding that there is no evidence the access extended beyond the third-party vendor's environment or that it is affecting any of Anthropic's systems.

Project Glasswing

In early April, Anthropic launched Project Glasswing, a defensive cybersecurity initiative built around Mythos Preview. The 12 launch partners included Amazon Web Services, Apple, Microsoft, Google, Cisco, CrowdStrike, Palo Alto Networks, NVIDIA, Broadcom, JPMorgan Chase, and the Linux Foundation, along with over 40 additional critical software organizations. The explicit goal was to give these defenders a head start: let Mythos hunt for vulnerabilities in their own systems and major open-source projects before malicious actors could weaponize the same capabilities.

Anthropic's own red-team testing reportedly showed Mythos could find and chain complex zero-days that had remained hidden for decades in software like Linux, OpenBSD, and FFmpeg.

Even as the Pentagon formally labeled Anthropic a “supply-chain risk” in March 2026 - citing the company’s refusal to remove ethical guardrails that would allow its models to be used for mass domestic surveillance and autonomous weapons - other key parts of the U.S. government have moved with urgency to embrace the very same technology. The National Security Agency is already actively using Claude Mythos Preview, while the White House’s Office of Management and Budget circulated an internal memo on Monday directing federal agencies to begin leveraging the model for vulnerability discovery in government networks. The Treasury Department has been particularly aggressive, rushing to secure access and convening major bank CEOs for urgent red-teaming sessions after being warned that Mythos could "hack every major system." 

A Low-Tech Breach

The unauthorized access was deceptively simple. One member of the Discord group (a private forum focused on hunting unreleased AI models) had legitimate access as a worker at a third-party contractor. Using knowledge of Anthropic's naming patterns, the group correctly guessed the private API endpoint for Mythos Preview on the very same day the limited release was announced.
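To see why this kind of "URL guessing" is plausible, consider how small the search space becomes once a vendor's naming conventions are known. The sketch below is purely illustrative: the hostnames, model slugs, and path layout are invented for this example and do not reflect Anthropic's actual infrastructure. It only builds candidate URLs as strings; it performs no network requests.

```python
from itertools import product

# Hypothetical naming conventions inferred from public documentation or a
# prior leak. None of these names are Anthropic's real endpoints.
HOSTS = ["api.example-ai.com", "preview.example-ai.com"]
VERSIONS = ["v1", "v2"]
MODELS = ["mythos-preview", "mythos"]

def candidate_endpoints(hosts, versions, models):
    """Enumerate every host/version/model combination as a candidate URL.

    With a handful of plausible values per slot, the full Cartesian
    product is small enough to check by hand -- which is the point:
    convention-based paths offer essentially no secrecy.
    """
    return [
        f"https://{host}/{version}/{model}/messages"
        for host, version, model in product(hosts, versions, models)
    ]

urls = candidate_endpoints(HOSTS, VERSIONS, MODELS)
print(len(urls))  # 2 hosts x 2 versions x 2 models = 8 candidates
```

With valid credentials in hand, an attacker needs only one of these guesses to resolve correctly, which is why obscure-but-predictable paths are no substitute for per-client authorization checks.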

Once inside, they continued using the model without triggering obvious alarms.

So, here's where we are: these AI models are becoming so powerful that even their creators treat them with extreme caution - yet the operational security surrounding them can still fall to basic tactics like credential misuse and URL guessing.

As of Wednesday, Anthropic has offered no further updates on its investigation, no timeline, and no announcement of technical fixes such as credential rotation or endpoint randomization. There is still no public evidence of malicious use by the Discord group - however, the breach raises serious questions about how many other restricted AI systems might be leaking through similar third-party or supply-chain vulnerabilities.
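"Endpoint randomization," one of the fixes mentioned above, would close the guessing avenue specifically. A minimal sketch of the idea, with invented names and assuming Python's standard `secrets` module: each authorized tenant is issued a high-entropy path segment that cannot be derived from naming conventions, so knowing the pattern no longer reveals the URL.

```python
import secrets

def issue_random_endpoint(base="https://api.example-ai.com/v1"):
    """Mint a per-tenant endpoint whose path cannot be guessed.

    token_urlsafe(24) draws 24 random bytes (~192 bits of entropy) and
    encodes them as a 32-character URL-safe string, making brute-force
    enumeration of the path infeasible. The base URL is illustrative.
    """
    token = secrets.token_urlsafe(24)
    return f"{base}/{token}/messages"

# Every issuance is independent, so two tenants never share a path.
endpoint_a = issue_random_endpoint()
endpoint_b = issue_random_endpoint()
assert endpoint_a != endpoint_b
```

Randomized paths are a hardening layer, not a replacement for authentication; the stolen-credential half of this breach would still need credential rotation and per-tenant access auditing to address.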
