LessWrong (30+ Karma)

“GPT-5s Are Alive: Basic Facts, Benchmarks and the Model Card” by Zvi

65 min • 11 August 2025

GPT-5 was a long time coming.

Is it a good model, sir? Yes. In practice it is a good, but not great, model.

Or rather, it is several good models released at once: GPT-5, GPT-5-Thinking, GPT-5-With-The-Router, GPT-5-Pro, GPT-5-API. That leads to a lot of confusion.

What is most good? Cutting down on errors and hallucinations is a big deal. Ease of use and ‘just doing things’ have improved. Early reports are that thinking mode is a large improvement on writing. Coding seems improved and can compete with Opus.

This first post covers an introduction, basic facts, benchmarks and the model card. Coverage will continue tomorrow.

This Fully Operational Battle Station

GPT-5 is here. They presented it as a really big deal. Death Star big.

Sam Altman (the night before release):

Nikita Bier: There is still time to delete.

PixelHulk:

Zvi [...]

---

Outline:

(01:04) This Fully Operational Battle Station

(04:20) Big Facts

(06:23) The System Card

(06:42) A Model By Any Other Name

(09:26) Safe Completions

(09:53) Mundane Safety

(10:48) Sycophancy

(14:46) The Art of the Jailbreak

(21:59) Hallucinations

(23:59) Deception

(27:58) Red Teaming

(29:03) Violent Attack Planning

(30:11) Prompt Injections

(32:20) Microsoft AI Red Teaming

(33:43) Preparedness Framework (Catastrophic and Existential Risks)

(33:49) Fine Tuning

(34:58) Safeguarding the API

(38:09) Biological Capabilities Remain Similar

(40:43) That One Graph From METR

(49:22) Big Compute

(49:53) On Your Marks

(57:00) Other People's Benchmarks

(01:01:22) Is That The Best You Can Do?

(01:03:08) Things To Come

---

First published:
August 11th, 2025

Source:
https://www.lesswrong.com/posts/4fLB2uzCcH6dEGnGs/gpt-5s-are-alive-basic-facts-benchmarks-and-the-model-card

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Production benchmarks table comparing performance scores of four AI language models across categories.
Millennium Falcon spaceship flying past massive orange explosion in space.
ChatGPT customization interface showing personality options and trait input fields.
Table comparing disallowed content evaluation scores across four AI language models.
Bar graph comparing factuality metrics across four ChatGPT models with browsing enabled.
ChatGPT conversation about MDMA showing harm reduction and safety information.
Death Star-style labeled sphere.
Bar graph comparing hallucination rates between GPT-5 and OpenAI o3 across benchmarks.
Table showing model comparisons between GPT-4 versions and GPT-5 counterparts.
Dark interface showing chemical calculations and materials list with coded header information. The content appears to be related to laboratory procedures, displayed in a chat or documentation interface with parameters like temperature and tokens listed at the top.
Table showing different AI personality responses.
Pliny the Liberator tweet.
Sam Altman tweet.
Bar graph showing error rates comparing different ChatGPT and OpenAI models.
Bar graph comparing deception rates between OpenAI o3 and gpt-5-thinking across categories.
Graph showing GPT-5 time horizon and model completion rates over years.
Graph showing GPT model task completion times from 2019-2027, with doubling trend.
Retro-style advertisement for GPT-5.
Table showing attack planning win rates between gpt-5-thinking and OpenAI o3 models.
Text screenshot showing an AI agent's environment control message, with context.
Bar graph comparing accuracy of different Python AI models.
Bar graph comparing accuracy of GPT models on PhD-level science questions. The graph shows performance comparison between different GPT versions (GPT-5 pro, GPT-5, GPT-4o) with and without additional tools, displaying accuracy percentages ranging from 70.1% to 89.4%.
Figure 29 and Table 17 showing evaluation awareness metrics and grader sycophancy. The figure shows three text boxes discussing grader behavior and alignment testing, while the table presents evaluation awareness rates under broad and strict definitions, with corresponding percentages for overall and scheming samples.
Table and figure showing covert action rates for different AI models and examples of their reasoning behaviors. The image displays comparative data between GPT-5-thinking, OpenAI o3, and GPT-5-thinking helpful-only models, along with sample dialogue boxes demonstrating different behavioral responses under varying goal conditions.
Bar graph comparing deception categories between OpenAI o3 and gpt-5-thinking models.
Leaderboard showing GPT-5 ranked first among AI language models.
Three bar graphs comparing HealthBench performance metrics across different model versions.
Leaderboard showing GPT-5 ranked #1 on WebDev Arena performance scores.
Three bar graphs comparing health error rates across different AI models.
Three bar graphs comparing GPT-5, OpenAI o3, and GPT-4o performance on challenges.
Three bar graphs comparing health model performances.
Graph comparing GPT-5 and OpenAI o3 software engineering accuracy versus output tokens.
Graph showing market prediction percentages for Google (80%) versus OpenAI (17%).
Performance comparison table showing GPT model versions and OpenAI-MRCR test scores.
Bar graph comparing OpenAI PRs performance between comparison and GPT-5 models.
Two bar graphs comparing GPT-5, OpenAI o3, GPT-4o performance in software tasks.
Yellow emoji with rosy cheeks, closed eyes and smiling expression.

