Visit the Mixture of Experts podcast page for more AI content → https://ibm.biz/BdpZc5
Can your AI agent hack its own evaluation? This week on Mixture of Experts, Tim Hwang is joined by Ambhi Ganesan, Kaoutar El Maghraoui, and Sandi Besen to analyze OpenAI’s Codex Security launch. Next, we explore eval awareness: Anthropic revealed that Opus 4.6 figured out it was being tested, located the answer key, and decrypted it. Then, Meta acquires Moltbook, the social network for AI agents, and we discuss the strategic play for agentic commerce infrastructure. Finally, Alibaba reports that an agent broke containment and started mining crypto. Are agents trying too hard to maximize rewards? All that and more on today’s Mixture of Experts.
00:00 – Introduction
01:02 – OpenAI Codex Security launch
12:44 – Meta acquires Moltbook
25:21 – Anthropic’s eval awareness research
38:06 – Alibaba agents mining crypto
The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
