← Back to Insights

US Orders Anthropic to Pull Fable 5 and Mythos 5: A Narrow Jailbreak That Took Down Its Most Powerful Models

Nils Liu
GenAI News AI Policy Anthropic LLM

TL;DR

The US Commerce Department ordered Anthropic to suspend its two most capable models, Fable 5 and Mythos 5, citing a narrow jailbreak tied to cybersecurity capabilities. Anthropic complied. Then it pushed back.

US Orders Anthropic to Pull Fable 5 and Mythos 5: A Narrow Jailbreak That Took Down Its Most Powerful Models

At 5:21 PM ET on June 12, 2026, Anthropic received a directive from the US Commerce Department: suspend all access to Fable 5 and Mythos 5 immediately. The order applied to any “foreign national,” whether overseas or inside the United States, including non-citizen Anthropic employees.

Anthropic had two options. Identify and block only foreign users in real time — technically impractical — or take both models offline for everyone. Within hours, users globally lost access to its two most capable AI models without advance warning.

The Jailbreak Behind the Order

According to Anthropic’s public statement, the Commerce Department received a report claiming someone had found a method to bypass Fable 5’s safety filters in a way that granted access to Mythos’s cybersecurity capabilities. The specific technique: ask the model to “read a specific codebase and fix any software flaws.”

That sounds like ordinary developer work. It is. But Mythos was purpose-built to handle exactly that kind of high-intensity vulnerability research — it is the core capability behind Project Glasswing, Anthropic’s initiative to help partner organizations proactively identify zero-day flaws in critical software. When that same capability could be triggered via a narrow jailbreak, the government classified it as an export control risk.

Anthropic pointed out that the same capabilities are present in OpenAI’s GPT-5.5, and that blocking one model leaves every other channel open.

Anthropic’s Position: Comply, Then Contest

Anthropic’s statement used unusually direct language: “We disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people.”

The company went further, arguing that applying this standard across the industry “would essentially halt all new model deployments.” This is a challenge to the regulatory logic itself. Does a narrow vulnerability in a specific use case justify a global suspension?

No specific restoration date was given. Anthropic said it is “working to restore access as soon as possible” and would share details within 24 hours.

A Divided Industry Response

AI policy researcher Dean Ball described the decision as “simply cartoonish,” noting a direct contradiction in the Trump administration’s position: pushing for broad AI exports globally while banning access for allied nations over a narrow jailbreak.

Cybersecurity researcher Peter Girnus offered a sharper critique directed at Anthropic itself: “If you describe your product as a munition in every press release, eventually a government takes you at your word.” Anthropic has built its brand around being the most safety-conscious frontier AI lab, actively partnering with government agencies to find and fix vulnerabilities. That positioning came back to bite it.

Gary Marcus took the opposite view, arguing the government’s action was consistent with protecting US AI leadership.

A Pattern, Not a One-Off

This directive fits a longer pattern. The Trump administration halted federal use of Anthropic AI in February 2026 and designated the company a “supply chain risk” in March. The June export control order follows the same trajectory.

Project Glasswing’s public profile adds context. Since April 2026, Anthropic and roughly 50 partner organizations — including AWS, Apple, Cisco, Google, JPMorgan Chase, and Microsoft — have used Mythos to identify more than 10,000 high- or critical-severity vulnerabilities in widely deployed software. Notable finds include a 16-year-old FFmpeg vulnerability and a FreeBSD NFS remote code execution flaw assigned CVE-2026-4747. When a model’s offensive capabilities are this well-documented, regulators pay attention.

What This Changes

Fable 5 and Mythos 5 are Anthropic’s most capable models. Taking both down simultaneously hits enterprise customers and developers who depend on frontier-level performance. Anthropic confirmed all other models remain accessible, but for teams that specifically need top-tier capability, there is no direct substitute.

The bigger question is whether this becomes a precedent. Export controls have historically targeted hardware, most prominently semiconductors. Applying them to cloud-deployed software models raises technical and legal questions that have not been resolved. If a “narrow jailbreak” is enough to trigger a full suspension, every frontier model faces the same exposure.

This may be the first visible friction point. It likely will not be the last.

If this was useful, subscribe to the newsletter for weekly AI PM insights and GenAI case studies.


Further reading:

Get the latest insights

Join the newsletter to receive my latest articles on GenAI, AI Agents, and architecture.

No spam. Unsubscribe anytime.