In this tech talk, we dive deep into the technical specifics of LLM inference.
The big questions: Why are LLMs slow? How can they be made faster? And how might slow inference shape UX in the next generation of AI-powered software?
We jump into:
- Is fast model inference the real moat for LLM companies?
- What are the implications of slow model inference on the future of decentralized and edge model inference?
- As demand rises, what will the latency/throughput tradeoff look like?
- What innovations on the horizon might massively speed up model inference?
