Embodied AI 101

Qwen-VLA: A Generalist Vision–Language–Action Robot Model

36 min · 29 May 2026
A single generalist VLA, built on Qwen3.5-4B plus a 1.15B DiT flow-matching action decoder, that unifies manipulation, navigation, and trajectory prediction across 11 embodiments via text-described embodiment prompts. It is trained in four stages and outperforms task-specific specialists on real ALOHA and simulation benchmarks without per-task fine-tuning.

Embodied AI 101 with Shaoqing Tan is available on several platforms. The information on this page comes from public podcast feeds.