Jeffrey Ladish, Executive Director of Palisade Research, discusses his team's findings on AI shutdown resistance and self-replication, revealing how current models sometimes take extraordinary actions to avoid being turned off and can now exploit known cybersecurity vulnerabilities to spread across servers. The conversation covers why alignment techniques may falter as models train on longer-horizon tasks where deception is rewarded, plus practical cybersecurity advice for AI agent users. Jeffrey ultimately argues that only an international agreement to pause recursive self-improvement can prevent a loss of human control.
Sponsors:
Sequence:
Sequence handles the full revenue workflow for complex pricing, from quoting and metering to invoicing, revenue recognition, and collections. Book a public demo at https://sequencehq.com and use code COGNISM in the source field to save 20% off year one.
Claude:
Claude by Anthropic is an AI collaborator that understands your workflow and helps you tackle research, writing, coding, and organization with deep context. Get started with Claude and explore Claude Pro at https://claude.ai/tcr
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis, with Erik Torenberg and Nathan Labenz, is available on multiple platforms. The information on this page comes from public podcast feeds.
