The Information Bottleneck

The Hidden Engine of Vision with Peyman Milanfar (Google)

1 hr 24 min · 10 April 2026

How Denoising Secretly Powers Everything in AI

Peyman Milanfar is a Distinguished Scientist at Google, where he leads the Computational Imaging team. He is a member of the National Academy of Engineering, an IEEE Fellow, and one of the key people behind the Pixel camera pipeline. Before joining Google, he was a professor at UC Santa Cruz for 15 years; at Google X, he helped build the imaging pipeline for Google Glass. His work has been cited over 35,000 times.

Peyman makes a provocative case that denoising, long dismissed as a boring cleanup task, is actually one of the most fundamental operations in modern ML, on par with SGD and backprop. Knowing how to remove noise from a signal basically means you have a map of the manifold that signals live on, and that insight connects everything from classical inverse problems to diffusion models.
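
(A standard result makes this claim precise, stated here for context rather than quoted from the episode: for Gaussian noise $y = x + \varepsilon$, $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$, Tweedie's formula says the optimal mean-squared-error denoiser is

$$D^{\ast}(y) \;=\; \mathbb{E}[x \mid y] \;=\; y + \sigma^2 \, \nabla_y \log p_\sigma(y),$$

where $p_\sigma$ is the distribution of the noisy observations. The denoising residual $D^{\ast}(y) - y$ is, up to scale, the score of the noisy data distribution: a vector field pointing back toward the manifold, and exactly the quantity diffusion models learn.)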

We go from early patch-based denoisers to his 2010 "Is Denoising Dead?" paper, and then to the question that redirected his research: if denoising is nearly solved, what else can denoisers do? That led to Regularization by Denoising (RED), which, if you unroll it, looks a lot like a diffusion process, years before diffusion models existed. We also cover how his team shipped a one-step diffusion model on the Pixel phone for 100x ProRes Zoom, the perception-distortion-authenticity tradeoff in generative imaging, and a new paper on why diffusion models don't actually need noise conditioning. The conversation wraps with a debate on why language has dominated the AI spotlight while vision lags, and Peyman's argument that visual intelligence, grounded in physics and robotics, is coming next.
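
To make the RED idea concrete, here is a minimal sketch of its plain gradient-descent form. This is illustrative only: the operator names and the toy box-filter denoiser are stand-ins, not code from the RED paper or from Google.

```python
import numpy as np

def box_denoise(x, k=5):
    """Toy stand-in denoiser (a moving average); any denoiser D(.) plugs in here."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    return np.stack([xp[i:i + x.size] for i in range(k)]).mean(axis=0)

def red_restore(y, A, At, denoise, lam=0.2, step=0.5, iters=200):
    """Regularization by Denoising (RED), plain gradient-descent variant.

    Minimizes 0.5 * ||A x - y||^2 + (lam / 2) * x.T @ (x - D(x)).
    Under RED's assumptions on D, the gradient of the prior term is
    simply lam * (x - D(x)), so each iteration nudges x toward its own
    denoised version. Unrolled, the trajectory resembles a deterministic
    diffusion process.
    """
    x = At(y)  # back-projected initialization
    for _ in range(iters):
        grad = At(A(x) - y) + lam * (x - denoise(x))
        x = x - step * grad
    return x

# Toy usage: A is the identity, so this reduces to denoising a 1-D signal.
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 4 * np.pi, 256))
noisy = clean + 0.3 * rng.standard_normal(256)
restored = red_restore(noisy, lambda v: v, lambda v: v, box_denoise)
```

Swapping the box filter for a learned denoiser is the whole point of the plug-and-play/RED family: the denoiser carries the prior, and the loop above is the engine.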

Timeline

0:00 Intro and Peyman's background

1:22 Why denoising matters more than you think; sensor diversity and Tesla's vision-only bet

15:04 BM3D and why it was secretly an MMSE estimator

17:02 "Is Denoising Dead?" then what else can denoisers do?

18:07 Plug-and-play methods and Regularization by Denoising (RED)

26:18 Denoising, manifolds, and the compression connection

28:12 Energy-based models vs. diffusion: "The Geometry of Noise"

31:40 Natural gradient descent and why flow models work

34:48 Gradient-free optimization and high-dimensional noise

45:13 Image quality and the perception-distortion tradeoff

48:39 Information theory, rate-distortion, and generative models

52:57 Denoising vs. editing

54:25 The changing role of theory

57:07 Hobbyist tools vs. shipping consumer products

59:40 Coding agents, vibe coding, and domain expertise

1:05:00 Vision and more complex, higher-dimensional signals

1:09:31 Do models need to interact with the physical world?

1:11:28 Continual learning and novelty-driven updates

1:13:00 On-device learning and privacy

1:15:01 Why has language dominated AI? Is vision next?

1:17:14 How kids learn: vision first, language later

1:19:36 Academia vs. industry

1:22:28 10,000 citations vs. shipping to millions: why choose?

Music:

  • "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
  • "Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
  • Changes: trimmed

About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.
