The world's most dangerous AI model was only public for 48 hours before someone with a laptop cracked it. 💀 Anthropic long considered its model "Mythos" too dangerous for public release. A few days ago, it was finally released as "Fable 5," featuring a security layer that secretly redirects sensitive queries like cybersecurity, chemistry, or bioweapons to a weaker model.
Anthropic spent over 1,000 hours searching for jailbreaks and found none that worked universally. The world's most famous AI jailbreaker, "Pliny the Liberator," needed significantly less time and claims to have bypassed the layer with a trick called "Pack Hunt": breaking dangerous queries into harmless components, using Cyrillic letters to circumvent keyword filters, and employing academic framing.
The uncomfortable truth: Every major AI model has been hacked so far. The question was never if, but how quickly. And that's precisely why AI safety is so important—a field currently dominated by open, unresolved questions.
Follow me for more AI content without the bullshit. 🧠
Sources: -anthropic.com/news/claude-fable-5-mythos-5-techcrunch.com/2026/06/09
-cybersecuritynews.com/anthropics-claude-fable-5-jailbroken-github.com/elder-plinius/CL4R1T4S-https://x.com/elder_plinius/status/2064776322979676227