Cybersecurity researchers are dissatisfied with Anthropic's newly released AI model, Fable, saying its overly strict guardrails hinder cybersecurity-related tasks. Users report that Fable rejects even basic requests linked to cybersecurity. The restrictions are meant to prevent the model from being misused to develop malware. Critics argue that the model misclassifies ordinary prompts as cybersecurity-related and then falls back to an earlier model, Claude Opus 4.8. The complaints underscore the tension between safety measures and practical utility in cybersecurity applications.

The Claude Fable logo is displayed on the screen of a smartphone placed on a reflective surface onto which the company's icon is projected. Image Credits: Samuel Boivin/NurPhoto / Getty Images


Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable

Lorenzo Franceschi-Bicchierai

8:41 AM PDT · June 10, 2026


Anthropic released its latest model, Fable, on Tuesday, billing it as a public and limited version of its powerful and much-hyped cybersecurity model, Mythos.

But not everyone is happy with the model's restrictions, and a number of cybersecurity researchers and professionals have aired complaints online.

“[Fable] rejects any request that could be tangentially cyber related. Even innocuous tasks like reading a blog post,” said Valentina “Chompie” Palmiotti, a well-known security researcher who works at IBM X-Force.

When a prompt triggers its guardrails, Fable pauses the chat and says that its “safety measures flagged this message for cybersecurity or biology topics.”

The guardrails were put in place to limit the risk that Fable could be used to develop malware or compromise software — a longstanding concern within Anthropic. The restrictions on biology come from a similar concern around developing biological weapons.

When the AI giant released Mythos in April, it restricted the model to a limited number of companies and organizations in what it called Project Glasswing, an effort to deploy the model to secure critical software and infrastructure. Last week, Anthropic expanded access to Mythos to hundreds of organizations in 15 countries.

But despite the good intentions, many cybersecurity experts are still put off by the haphazard nature of the restrictions. Matt Suiche, a cybersecurity veteran, told TechCrunch that “if you ask it to write secure code, it assumes it is cybersecurity related work instead of software engineering best practices, and you get downgraded.” Fable is programmed to fall back to Claude Opus 4.8 if it hits a guardrail. “It seems to be keyword based, so anything in the lexical field of ‘cybersecurity’ triggers the guardrails.”
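To illustrate why a keyword-based filter of the kind Suiche suspects would misfire on ordinary engineering requests, here is a minimal sketch in Python. It is not Anthropic's implementation, which has not been published. The term list, the `route` function, and the model identifier strings are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical trigger terms; the real list, if one exists, is not public.
FLAGGED_TERMS = {"secure", "security", "cybersecurity", "exploit", "malware"}

# Illustrative identifiers for the two models named in the article.
PRIMARY_MODEL = "fable"
FALLBACK_MODEL = "claude-opus-4.8"


def route(prompt: str) -> str:
    """Return the model a naive keyword router would choose for a prompt."""
    # Lowercase the prompt and split it into alphabetic words.
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    # Any overlap with the flagged terms triggers the fallback,
    # regardless of what the prompt is actually asking for.
    if words & FLAGGED_TERMS:
        return FALLBACK_MODEL
    return PRIMARY_MODEL


print(route("Write secure code for a login form"))  # claude-opus-4.8
print(route("Sort a list of integers"))             # fable
```

In this sketch, the word "secure" alone is enough to downgrade a routine software engineering request, which matches the behavior Suiche describes. A filter that matches on vocabulary rather than intent cannot distinguish a defensive or benign request from a malicious one.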
