Dr. Steven Byrnes is one of the few people who both understands why alignment is hard and is taking a serious technical shot at solving it. He's the author of several recently popular posts on the topic.
After his UC Berkeley physics PhD and Harvard postdoc, he became an AGI safety researcher at Astera. He's now deep in the neuroscience of reverse-engineering how the human brain actually works, knowledge that could plausibly help us solve the technical AI alignment problem.
He has a whopping 90% P(Doom), but argues that LLMs will plateau before becoming truly dangerous, and the real threat will come from next-generation “brain-like AGI” based on actor-critic reinforcement learning.
We cover Steve's "two subsystems" model of the brain, why current AI safety approaches miss the mark, his disagreements with "social evolution" [...]
---
Outline:
(01:18) Video
(01:24) Podcast
(01:44) Transcript
(01:47) Cold Open
(02:13) Introducing Steven Byrnes
(09:10) Path to Neuroscience and AGI Safety
(18:53) Research Direction and Brain-like AGI
(23:47) The Two Brain Subsystems
(45:28) Language Acquisition and Learning
(50:19) LLM Limitations
(01:10:10) Brain-like AGI
(01:16:04) Actor-Critic Reinforcement Learning
(01:41:10) Alignment Solutions and Reward Functions
(01:48:31) Actor-Critic Model and Brain Architecture
(02:00:33) Current AI vs Future Paradigms
(02:06:39) LLM Limitations and Capabilities
(02:13:24) Inner vs Outer Alignment
(02:19:28) AI Policy and Pause AI Discussion
(02:25:49) Lightning Round
(02:32:19) Closing Thoughts
---
First published:
August 5th, 2025
---
Narrated by TYPE III AUDIO.