A deep dive into Simple Self-Distillation (SSD): how large language models can improve by training on their own unverified outputs, with zero external supervision. We unpack the Precision-Exploration Conflict, the roles of locks (the need for precision) and forks (creative exploration), and how SSD reshapes token distributions to sharpen precision while preserving exploration. We review the Quinn 330B Instruct results on LiveCodeBench (notable ~30% relative gains, with stronger improvements on hard problems) and discuss the surprising finding that even data containing gibberish can help models learn the geometry of problem-solving. Finally, we consider what latent capabilities might be unlocked when models learn from their own guesses, and what this could mean for AI-assisted problem solving.
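As a rough intuition for the "sharpening" idea discussed in the episode (this is my own toy sketch, not the actual SSD training procedure): if a model repeatedly trains on its own outputs weighted by its own confidence, probability mass concentrates on the tokens it already favors. Modeling one round of confidence-weighted self-training as squaring each probability and renormalizing (equivalent to sampling at temperature 1/2) makes the effect easy to see.

```python
def sharpen(probs, rounds=1):
    """Toy model of confidence-weighted self-training: each round squares
    the token probabilities and renormalizes, concentrating mass on the
    mode (precision) while shrinking low-probability "fork" tokens."""
    for _ in range(rounds):
        squared = [p * p for p in probs]
        total = sum(squared)
        probs = [p / total for p in squared]
    return probs

# A flat-ish distribution over four candidate tokens.
p0 = [0.4, 0.3, 0.2, 0.1]
p3 = sharpen(p0, rounds=3)
# After three rounds the leading token holds over 90% of the mass,
# illustrating how self-training can sharpen precision at the cost of
# exploration if nothing counteracts the collapse.
```

Real SSD would operate on sampled sequences and gradient updates rather than closed-form reweighting; the sketch only illustrates the distribution-shaping dynamic the episode describes.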
Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.
Sponsored by Embersilk LLC
Intellectually Curious with Mike Breault is available on multiple platforms.
