We unpack DeepMind's Gemini ER 1.6, an embodied reasoning model that grounds language in physical space with precise pointing, multi-camera success checks, and agentic action. See how its 'frontal lobe' plans tools and tasks, writes on-the-fly code to measure dial angles, and coordinates with 'VLA' muscle models to safely operate in messy environments—from reading gauges to Spot inspections. We'll explore the architecture, grounding techniques, safety constraints, and what this means for the future of autonomous robots and AI training.
Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.
Sponsored by Embersilk LLC
Fler avsnitt av Intellectually Curious
Visa alla avsnitt av Intellectually CuriousIntellectually Curious med Mike Breault finns tillgänglig på flera plattformar. Informationen på denna sida kommer från offentliga podd-flöden.
