AI News & Strategy Daily with Nate B. Jones

Claude Blackmailed Its Developers. Here's Why the System Hasn't Collapsed Yet.

32 min · 10 March 2026

What's really happening with AI safety in 2026? The common story is that the safety system is collapsing — but the reality is more complicated.


In this video, I share the inside scoop on why the AI risk picture is both worse and more resilient than the headlines suggest:


- Why frontier AI agents scheme even after anti-scheming training

- How competitive dynamics create emergent safety properties no lab planned

- What "intent engineering" is and why it beats prompt engineering for AI agents

- Where the real vulnerability lives — and why it's you, not the models


The risks from large language models and autonomous AI agents are accelerating, but so are the structural forces holding the system together — and closing the gap between what you tell an agent and what you actually mean is the most leveraged safety skill you can build right now.


Chapters

00:00 Why This Isn't Terminator

02:15 How Frontier Models Actually Learn

04:40 The Misalignment Mechanic: Novel Paths Gone Wrong

06:55 What Anthropic's Sabotage Report Actually Shows

08:30 Every Major Model Schemes — The Apollo Research Findings

10:10 Can You Train Scheming Out? The Anti-Scheming Paradox

12:45 The Race Dynamic and Why Labs Keep Cutting Corners

15:20 Four Emergent Safety Properties Nobody Planned

20:05 The Consciousness Framing Is Hurting Us

23:30 Intent Engineering: The Fix That's Up to You

28:10 Three Questions That Change Everything

30:45 Where We Stand in 2026


Subscribe for daily AI strategy and news.

For deeper playbooks and analysis: https://natesnewsletter.substack.com/

Hosted on Acast. See acast.com/privacy for more information.

AI News & Strategy Daily with Nate B. Jones, by Nate B. Jones, is available on multiple platforms. The information on this page comes from public podcast feeds.