Intellectually Curious

Dion: Distributed Orthogonal Updates for Scalable AI Training

5 min · 4 August 2025
An exploration of the Dion optimizer (Distributed Orthogonal Updates) and how it tackles the scalability bottlenecks of training giant models. We break down why orthonormal updates matter, why Muon’s dense-matrix approach struggles with sharded, multi-GPU deployments, and how Dion uses amortized power iteration with QR and Cholesky on distributed shards to deliver fast, communication-efficient updates. Learn about integration with PyTorch DDP, FSDP2, and tensor parallelism, rank-fraction compression with error feedback, and the empirical gains in wall-clock time over AdamW and Muon at scale, plus what this could unlock for the future of AI training.
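
For readers who want a concrete picture of "amortized power iteration with QR" and "error feedback", here is a minimal single-device NumPy sketch. It is not the Dion implementation and does not come from the episode: the function name `power_iteration_step`, the momentum factor `mu`, and the exact form of the error-feedback update are assumptions made for illustration, and all sharding and communication are left out. "Amortized" here means that only one power-iteration step is run per call, with the right basis `Q` carried over from the previous call.

```python
import numpy as np

def power_iteration_step(M, G, Q, mu=0.95):
    """One low-rank update step (hypothetical helper, illustration only).

    M: (m, n) momentum buffer, G: (m, n) gradient,
    Q: (n, r) right basis carried over from the previous step.
    """
    B = M + G                             # fold the new gradient into the buffer
    P, _ = np.linalg.qr(B @ Q)            # (m, r) with orthonormal columns
    R = B.T @ P                           # (n, r) projection of B onto span(P)
    # Error feedback (assumed form): subtract only part of the captured
    # rank-r component, so whatever was not applied stays in the buffer.
    M_next = B - (1.0 - mu) * (P @ R.T)
    col_norms = np.linalg.norm(R, axis=0, keepdims=True)
    Q_next = R / np.maximum(col_norms, 1e-8)  # unit-norm columns, reused next step
    # P has orthonormal columns; Q_next columns are only unit-norm,
    # so this update is approximately, not exactly, orthonormal.
    update = P @ Q_next.T                 # (m, n) rank-r update direction
    return update, M_next, Q_next

rng = np.random.default_rng(0)
m, n, r = 8, 6, 2
M = np.zeros((m, n))
Q = rng.standard_normal((n, r))
G = rng.standard_normal((m, n))
update, M, Q = power_iteration_step(M, G, Q)
```

In a training loop the returned `update` would be scaled by a learning rate and subtracted from the weight matrix, while `M` and `Q` are kept as optimizer state for the next step.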


Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.

Sponsored by Embersilk LLC


Intellectually Curious with Mike Breault is available on several platforms. The information on this page comes from public podcast feeds.