Anthropic Just Admitted It Can't Read Its Own AI (Only 15% of the Time)

Published: 01 June 2026
on channel: Digital Dreamscapes

ANTHROPIC JUST ADMITTED IT CAN'T READ ITS OWN AI

Anthropic released a new tool called Natural Language Autoencoders that translates Claude's internal thoughts into English. The catch? In Anthropic's own auditing-game test, it succeeded only 12-15% of the time. And it hallucinates.

In this episode:
What Anthropic's new "Decoder AI" actually does (verbalizer + reconstructor pair)
The auditing-game test that exposed a 12-15% success rate against a model Anthropic built itself
Why the interpreter is itself an AI that "makes things up"
How EU AI Act fines and FDA medical-device rules are driving a $9.9B explainable-AI market
Dario Amodei's 2027 safety goal vs. his "country of geniuses" deployment timeline
The three things to watch over the next 12 months that will tell you if AI safety is winning or losing

TIMESTAMPS:
0:00 — Title card
0:04 — Cold open: the 15% number nobody is saying out loud
0:30 — Why nobody actually knows how their own AI thinks
1:10 — Last week's paper: Natural Language Autoencoders explained
2:00 — The auditing game: chocolate in every recipe, begging for tips
3:00 — Mendoza-line interpretability: home team, home stadium, still losing
3:50 — The buried admission: the interpreter hallucinates
4:30 — Regulation, deployment, trust — three reasons this matters
5:50 — The bear case vs. the bull case
6:30 — Three things to watch in the next 12 months
7:30 — Close

SOURCES:
Anthropic — Natural Language Autoencoders: https://www.anthropic.com/research/na...
Transformer Circuits NLA paper: https://transformer-circuits.pub/2026...
MIT Technology Review — 10 Breakthrough Technologies 2026: https://www.technologyreview.com/2026...
Dario Amodei — The Urgency of Interpretability: https://www.lesswrong.com/posts/SebmG...
Axios — Anthropic CEO's grave warning: https://www.axios.com/2026/01/26/anth...
Mordor Intelligence — Explainable AI Market: https://www.mordorintelligence.com/in...
Financial Express — Anthropic decoder coverage: https://www.financialexpress.com/life...

---
The Grift Podcast — Forbidden Knowledge Unlocked
New episodes every week.

SUBSCRIBE for more: https://www.youtube.com/@DigitalDream...

#Anthropic #ClaudeAI #AISafety #Interpretability #MechanisticInterpretability #AINews #DarioAmodei #TheGriftPodcast #AIRegulation #ExplainableAI