
Build Wiz AI Show

🛡️ Breaking Agent Backbones: Evaluating LLM Security in AI Agents

16 min•31 October 2025

Breaking Agent Backbones: AI agents are being deployed at scale, but their security is challenged by non-deterministic behavior and novel vulnerabilities. This episode introduces the "threat snapshot" framework and the new b3 benchmark, which systematically isolate and evaluate security risks stemming from the backbone LLM. We reveal crucial findings: enhanced reasoning capabilities generally improve security, yet model size does not correlate with lower vulnerability scores.

More episodes of Build Wiz AI Show

Policy on the AI Exponential

11 June•24 min

The Rise of Recursive Self-Improvement at Anthropic

5 June•21 min

AlphaProof Nexus: Advancing Mathematics Research via AI Formal Proof Search

25 May•21 min

Pi - and self-modifying AI Agents

22 May•19 min

Code with Claude - London 2026

22 May•24 min

Google I/O 2026 keynote

20 May•23 min

The Langchain Agent Development Keynote 2026

20 May•20 min

Building the Software Factory: From Code to Autonomy

19 May•22 min

Spec-Driven Development and Agentic Workflows in 2026

15 May•21 min

Efficient Pre-Training with Token Superposition

14 May•22 min

Build Wiz AI Show by Build Wiz AI is available on multiple platforms. The information on this page comes from public podcast feeds.