The US has lifted export curbs on Anthropic’s newest Claude models, Fable 5 and Mythos 5, about three weeks after the Trump administration flagged the models as national security risks.
As of today, Anthropic confirmed in a blog post, Fable 5 will be available globally, and US organizations have had access restored to Mythos 5 since June 26. Anthropic said it is now working with the government to expand Mythos access to a “broader set of domestic and international partners in the Glasswing program.” That program allows cybersecurity researchers at trusted companies to access Mythos for defensive purposes.
In a letter to Anthropic viewed by Reuters and The New York Times, Commerce Secretary Howard Lutnick said Anthropic would “no longer need a license for exports or in-country transfers of its Claude Mythos and Claude Fable AI models.” The letter acknowledged that Anthropic had “taken steps in close coordination with the US government to address the risks” posed by the models.
Lutnick said that Anthropic, facing a longer delay in its models’ releases, agreed to expand its partnership with the government. The company said it also set up a program to work with hackers to red-team its models, and a dedicated internal team now monitors reports of emerging jailbreak threats 24/7.
In the letter, Lutnick reminded Anthropic that the US “reserves the right to re-evaluate the decisions” and reimpose export curbs at any point. But for now, Lutnick joined White House Chief of Staff Susie Wiles in celebrating Fable 5’s redeployment on X.
“Over the past two weeks, we have worked closely with Anthropic to analyze and approve Fable 5 to ensure alignment across the US Government and strengthen America’s leadership in AI,” Lutnick said.
Wiles did not directly mention Anthropic but claimed a win for Trump, writing that “the government and private sector have worked together in a way we have never seen before and this foundation of America First is unprecedented. Our shared priority remains: get the best tech deployed as quickly and safely as possible.”
Trade-off: Fable 5 may block routine coding tasks
On June 12, the Commerce Department ordered Anthropic to shut off access to its most advanced models for anyone outside the US. The order emerged from fears that China, Russia, or other countries of concern may exploit the models to attack US infrastructure, like the electric grid or the banking system. In response, Anthropic shut down all access, as it didn’t have a way to block users by country.
In particular, Mythos was viewed as “uniquely attractive to malicious actors who wish to misuse it in cyberattacks,” Anthropic’s blog said. According to Anthropic, the model “can be used to find and exploit software vulnerabilities more effectively than any other model—and all but the most skilled human security experts,” and those “prodigious cybersecurity capabilities” could be used against the US.
Fable 5 shares the “same underlying model,” Anthropic said, but unlike Mythos 5, it “provides no such unique offensive capabilities.” Designed for the general public, Fable 5 already had the strongest safeguards Anthropic has ever applied to a model, and Anthropic said those safeguards are now even stronger ahead of redeployment.
After weeks of testing, Fable 5 is no longer vulnerable to the bypass method, discovered by Amazon researchers, that got the model to identify several software vulnerabilities and triggered the export curbs. Most troubling, Anthropic said, was a case in which the model was manipulated into producing code that demonstrated how a vulnerability could be exploited.
According to Anthropic, testing confirmed that less advanced rival models on the market, like GPT-5.5 and Kimi K2.7, “could identify the same vulnerabilities as Fable 5 did in the report.” That confirmed that “the reported technique did not expose any unique Mythos-level cyber capabilities,” Anthropic said, and “only involved routine defensive cybersecurity work.”
“Even so, we moved quickly to address the reported bypass,” Anthropic wrote. That jailbreak method is currently blocked in over 99 percent of cases, Anthropic said. However, tightening safeguards came with a “trade-off” that may cause some benign prompts to be blocked “during routine coding and debugging tasks,” the company acknowledged.
“Working closely with the government, we trained an improved safety classifier that targets and blocks the behavior described in the report,” Anthropic said. “Users will be notified if a request to Fable 5 is blocked, and the request will instead be sent to Opus 4.8.”
Of course, Anthropic’s new classifier, which helps avoid uniquely dangerous attacks on the models, can make “mistakes,” Anthropic said. The company has long maintained that it’s “probably impossible” to build a model fully “impervious” to jailbreaks, but by ramping up red-teaming, Anthropic hopes to “ensure that we and our safety partners will be the first to find major jailbreaks and fix them before malicious actors can use them for harm.”
The attack Amazon flagged currently works only in a “very small fraction of cases,” where “the model may provide information that isn’t detailed enough to help a cyberattacker,” Anthropic said.
Thanks to that “cautious” approach, Anthropic said, “the vast majority of jailbreaks will not successfully unblock dangerous behaviors” and will be “very costly and high-effort to produce.”
“Even if a jailbreak is successful, our extra layers of defense,” which require some blocking of benign requests, “provide additional mitigation,” the company said.
Anthropic’s plan to score jailbreaks
Anthropic’s blog post seems to downplay the threat that Amazon identified, casting it as less risky than what the company considers the greatest threat to governments: universal jailbreaks that can unlock a wide range of vulnerabilities and enable unforeseeable attacks.
To streamline the private-public partnership and ensure the most rapid response to the biggest risks, Anthropic said the AI industry’s goal should be categorizing risks to ensure proper interventions both internally and from the government.
Currently, Anthropic is working with Amazon, Microsoft, Google, and other Glasswing partners to “draft a consensus framework for assessing the severity of AI jailbreaks and how AI developers should respond to them.”
Other industry partners are welcome to join those talks, Anthropic said, while conceding that the process is “imperfect.” The talks focus on establishing four criteria for scoring a jailbreak: how much capability the jailbreak provides, how many offensive tasks it enables, how easy it is for a human to weaponize (single-prompt jailbreaks are flagged as the riskiest), and whether discovering it requires specialist knowledge.
Using this framework, Anthropic has built a team that will monitor jailbreak submission channels 24/7, the blog said. The AI firm also confirmed that it is launching “a new HackerOne program through which security researchers can submit potential cyber jailbreaks they’ve discovered in Fable 5” to keep red-teaming a top priority.
Anthropic deepens government ties
For Anthropic, one outcome of government testing seems to be an improved relationship with the government after it sued the US over a national security risk designation that blacklisted the AI firm. Anthropic claimed the designation was retaliation after the company’s refusal to grant government access to models for the purposes of building autonomous weapons or conducting domestic mass surveillance.
In its blog, Anthropic said it is expanding its commitments to working with government partners on pre-deployment testing and evaluation. Those efforts will include giving the government early access to frontier models, rapidly sharing information on new jailbreak methods, and dedicating resources to joint research that will “help advance the state of the art in AI evaluation,” Anthropic said.
The collaboration offers “the beginnings of a template for effective global coordination on the risks and benefits of AI,” Anthropic said, while urging Congress to pass laws to ensure that all frontier model developers are on the same page.
The government is moving too slowly for Anthropic’s comfort. Anthropic CEO Dario Amodei floated his legislative proposal earlier this month, making a Lord of the Rings reference to emphasize his point:
In one of the side plots to The Lord of the Rings, two of the Hobbits attempt to rouse Treebeard—a wise but ponderous sentient tree—to defend his forest from an army that is cutting it down. The problem is that Treebeard operates at a very different speed than the Hobbits. It takes him a full day simply to say hello to another tree, so getting him and his peers to act fast enough is nearly impossible. The intersection of AI and our political institutions feels a bit like the Hobbits and Treebeard.
Initially, Trump planned to be hands-off on AI regulations in an attempt to spur innovation. However, Anthropic’s Mythos release spooked Trump into requesting voluntary safety testing of frontier models in May. Since then, Trump is “still working on a framework for how companies should formally submit new AI models for review, and what standards they would be held to,” two people familiar with the discussions told the NYT.
In his post, Amodei called on Congress to act quickly to reimagine safety regulations for a world in which “AI can go from an amusing toy” to a “full country of geniuses in a data center,” or else risk suffering “national strategic” consequences.
However, Isaac Harris, executive director of Frontier Security Institute, a nonprofit focused on AI and national security, told Reuters that the “biggest question mark” after Anthropic’s deepened partnership with the government is “how equivalently dangerous capabilities coming from China with less guardrails will be handled by the administration in the US market.”
Notably, Anthropic recently accused Chinese AI firm Alibaba of launching the largest cloning attack on Claude. In response, Anthropic urged Congress to pass laws that would punish Chinese firms found stealing US firms’ work. Otherwise, malicious actors who can’t get their hands on Anthropic’s models might turn to Chinese models, which have weaker safeguards and increasingly comparable capabilities, to launch attacks that blindside the US.
Ashley Belanger Senior Policy Reporter
Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.
Read Original at Ars Technica →


