Giancarlo Kerg is a PhD student at Mila, supervised by Yoshua Bengio and Guillaume Lajoie. He works on out-of-distribution generalization and modularity in memory-augmented neural networks.
Highlights from our conversation:
🧮 Pure math foundations as an approach to progress and structural understanding in deep learning research
🧠 How a formal proof that self-attention mitigates vanishing gradients when capturing long-term dependencies in RNNs led to a relevancy screening mechanism resembling human memory consolidation
🎯 Out-of-distribution generalization through modularity and inductive biases
