Matt Steiner, VP of Monetization Infrastructure, Ranking & AI Foundations at Meta, walks through how Meta's ad system actually works, and why the infrastructure behind it differs from what you'd build for LLMs.
We cover Andromeda (retrieval on a custom NVIDIA Grace Hopper SKU Meta co-designed), Lattice (consolidating N ranking models into one), GEM (Meta's Generative Ads Recommendation foundation model), and the adaptive ranking model, a roughly one-trillion-parameter recommender served at sub-second latency.
We get into why recommender workloads aren't embarrassingly parallel the way LLM workloads are (the "personalization blob"), what that means for Meta's MTIA custom silicon roadmap, and how LLM-written kernels (KernelEvolve) flipped the economics of running a heterogeneous hardware fleet. Demand for software engineering has actually gone up as its price has come down. Meta now wants ~100x more optimized kernels per chip.
Read the full transcript at https://www.chipstrat.com/p/an-interview-with-meta-vp-matt-steiner
Chapters:
0:00 Intro and scale
0:39 How Meta's ad system works
2:00 Meta Andromeda and the custom NVIDIA SKU
3:30 Lattice: consolidating ranking models
5:00 GEM, Meta's ads foundation model
6:30 Adaptive ranking for power users
8:17 The scale: 3B DAUs at sub-second latency
9:40 Why longer interaction histories matter
10:45 The anniversary gift analogy
12:57 A decade of compute evolution
15:21 Meta's infra as a CP-SAT problem
16:07 Co-designing Grace Hopper with NVIDIA
17:47 Matching compute shape to workload
18:26 Influencing hardware and software roadmaps
20:23 MTIA: why ads aren't LLMs
22:07 The personalization blob and I/O ratios
26:38 One trillion parameters at sub-second latency
28:26 Heterogeneous hardware trade-offs
29:30 KernelEvolve: LLMs writing custom kernels
33:30 GenAI and recommender systems cross-pollination
35:21 The 2-year infrastructure outlook
37:00 Why demand for software engineering is rising
38:53 How Matt stays on top of it all
Relevant reading:
KernelEvolve (Meta Engineering): https://engineering.fb.com/2026/04/02/developer-tools/kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure/
Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/chipstrat
Semi Doped with Vikram Sekar and Austin Lyons is available on multiple platforms.
