LessWrong (30+ Karma)

“Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity” by habryka

12 min • 11 July 2025

METR released a new paper with very interesting results on developer productivity effects from AI. I have copied their blog post here in full.

We conduct a randomized controlled trial (RCT) to understand how early-2025 AI tools affect the productivity of experienced open-source developers working on their own repositories. Surprisingly, we find that when developers use AI tools, they take 19% longer than without—AI makes them slower. We view this result as a snapshot of early-2025 AI capabilities in one relevant setting; as these systems continue to rapidly evolve, we plan on continuing to use this methodology to help estimate AI acceleration from AI R&D automation [1].

See the full paper for more detail.

Motivation

While coding/agentic benchmarks [2] have proven useful for understanding AI capabilities, they typically sacrifice realism for scale and efficiency—the tasks are self-contained, don’t require prior context to understand, and use algorithmic evaluation [...]

---

Outline:

(01:23) Motivation

(02:39) Methodology

(03:56) Core Result

(05:15) Factor Analysis

(06:12) Discussion

(11:08) Going Forward

---

First published:
July 11th, 2025

Source:
https://www.lesswrong.com/posts/9eizzh3gtcRvWipq8/measuring-the-impact-of-early-2025-ai-on-experienced-open

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Graph showing model capabilities compared across RCT, benchmarks, and anecdotes.
Software development workflow diagram showing AI-allowed and AI-disallowed development paths.
Bar graph comparing AI-allowed vs AI-disallowed developer forecasts and actual implementation times.
Table showing five factors affecting AI development in software repositories.

The table lists observations about over-optimism, developer familiarity, repository complexity, reliability, and context challenges.
Graph showing AI's impact on developer speed, comparing forecasts versus reality.

The graph compares various expert predictions about AI's effect on developer productivity against the actual observed results, with an interesting contrast between expected speedup and actual slowdown.

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
