
Tool Use - AI Conversations

When AI Benchmarks Lie: A Better Way to Evaluate Ft. Chris Hay

58 min • 26 November 2024

This episode explores the world of AI evaluation, with insights from Chris Hay on why benchmarks are "stupid" and how to effectively evaluate AI models.

Get the tools: pip install tool-use-ai

Check out Chris' channel: https://www.youtube.com/@chrishayuk

Links:
https://github.com/EleutherAI/lm-eval...
Lessons from the Trenches on Reproducible Evaluation of Language Models - https://arxiv.org/pdf/2405.14782
https://github.com/confident-ai/deepeval

Connect with us:
https://x.com/ToolUseAI
https://x.com/MikeBirdTech
https://x.com/FieroTy
https://x.com/chrishayuk

*The opinions of Chris are purely Chris's opinions and don't represent the opinions of his employer.

More episodes of Tool Use - AI Conversations

Hermes Agent has won. Here's why

14 Apr • 35 min

How To Make Your Websites Fully Autonomous (ft rtrvr)

17 Mar • 42 min

How To Make Your A.I. Product Go Viral (ft Mano Tsiris)

10 Mar • 47 min

How To Build a Hybrid AI System with Any-LLM (ft Nathan Brake)

3 Mar • 44 min

AI Sovereignty - Control Your Entire AI architecture (ft Max McCrea)

24 Feb • 1 h 6 min

Do You Need A Vector Database in 2026? (ft Arjun Patel)

17 Feb • 59 min

Fine-Tune Your Own A.I. Video Model (ft. Greg Schoeninger)

10 Feb • 43 min

From Marketer to Growth Engineer Using AI (ft Justin Borge)

3 Feb • 43 min

Ryan Carson Explains The Ralph Wiggum Loop

27 Jan • 50 min

Advanced Claude Code Part 2 (ft Eric Buess)

20 Jan • 52 min

Tool Use - AI Conversations with Mike Bird is available on several platforms. The information on this page comes from public podcast feeds.