Anthropic apologizes for invisible Claude Fable guardrails

The company says it will make the covert safeguard preventing model distillation as visible as other safety measures.

By Robert Hart

Jun 11, 2026, 11:40 AM UTC

Robert Hart is a London-based reporter at The Verge covering all things AI and a Senior Tarbell Fellow. Previously, he wrote about health, science, and tech for Forbes.

Anthropic has apologized for stealthily throttling its new AI model, Claude Fable 5, with hidden guardrails that undermine both researchers and rivals using it to develop competing systems. The company says it is reversing course and will be more transparent about when the restrictions kick in, even if that means Fable refuses more queries.

Fable is the first widely available model in Anthropic’s Mythos class of AI systems, a group the company has spent months warning is too dangerous for public release. Anthropic says it has addressed some of those risks by launching Fable with safeguards that prevent it from responding to certain “high-risk” queries. One of the areas in which Anthropic said it would restrict Fable’s responses is distillation, a technique for training smaller AI models using the outputs of larger ones.

In Fable’s system card — a public document AI developers release to explain how a system works — Anthropic said it would handle queries it believed were distillation attempts by altering and degrading the model’s answers directly. Users would not be notified that they had triggered the safety measure or informed that the responses had been changed.

Anthropic is now changing its approach to distillation: queries flagged as distillation attempts will fall back to Claude Opus 4.8, Anthropic’s previous flagship model, the company said in a post on X. Anthropic will also prominently notify users when this happens: “You will see this every time it happens.”

This is similar to how Fable handles queries in other high-risk areas. When safety features are triggered in areas like biology, chemistry, and cybersecurity, queries are routed through Opus 4.8 unless they are blocked outright under the company’s broader safety rules, such as those covering drugs, weapons, or other prohibited content. In some cases, notably biology, the safeguards have been calibrated so broadly that Fable is practically unusable for even basic queries, something Anthropic acknowledged in a comment to The Verge.

“Visible safeguards can be probed, so they have to be robust, which takes time to get right,” Anthropic wrote. “Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.”

The change follows intense backlash from the AI research community over Anthropic’s decision to silently limit users suspected of trying to distill Fable into competing models — a safeguard critics warned could also affect third parties trying to evaluate the frontier model. In the system card, Anthropic said newer models’ ability to accelerate AI development justified targeting those requests, noting that “using Claude to develop competing models already violates our Terms of Service.” Anthropic has previously accused Chinese rivals like DeepSeek of unfairly distilling its models on an “industrial” scale.
