Build Wiz AI Show

Catching AI Sleeper Agent - LLM Backdoors

16 min•5 February 2026

Could your trusted AI model be a hidden "sleeper agent" just waiting for a secret command to turn malicious? We explore a new methodology that extracts and reconstructs backdoor triggers by exploiting the surprising fact that these models often strongly memorize their own poisoning data. Tune in to discover how this inference-only scanner can unmask hidden threats across various LLMs without needing any prior knowledge of the attacker’s specific trigger or target behavior.

Source: https://arxiv.org/pdf/2602.03085

More episodes of Build Wiz AI Show

Policy on the AI Exponential

11 June•24 min

The Rise of Recursive Self-Improvement at Anthropic

5 June•21 min

AlphaProof Nexus: Advancing Mathematics Research via AI Formal Proof Search

25 May•21 min

Pi - and self-modifying AI Agents

22 May•19 min

Code with Claude - London 2026

22 May•24 min

Google I/O 2026 keynote

20 May•23 min

The LangChain Agent Development Keynote 2026

20 May•20 min

Building the Software Factory: From Code to Autonomy

19 May•22 min

Spec-Driven Development and Agentic Workflows in 2026

15 May•21 min

Efficient Pre-Training with Token Superposition

14 May•22 min

Build Wiz AI Show by Build Wiz AI is available on multiple platforms. The information on this page comes from public podcast feeds.