LessWrong (30+ Karma)

“GPT-5s Are Alive: Basic Facts, Benchmarks and the Model Card” by Zvi

65 min • 11 August 2025

GPT-5 was a long time coming.

Is it a good model, sir? Yes. In practice it is a good, but not great, model.

Or rather, it is several good models released at once: GPT-5, GPT-5-Thinking, GPT-5-With-The-Router, GPT-5-Pro, GPT-5-API. That leads to a lot of confusion.

What is most good? Cutting down on errors and hallucinations is a big deal. Ease of use and ‘just doing things’ have improved. Early reports are that thinking mode is a large improvement on writing. Coding seems improved and can compete with Opus.

This first post covers an introduction, basic facts, benchmarks and the model card. Coverage will continue tomorrow.

This Fully Operational Battle Station

GPT-5 is here. They presented it as a really big deal. Death Star big.

Sam Altman (the night before release):

Nikita Bier: There is still time to delete.

PixelHulk:

Zvi [...]

---

Outline:

(01:04) This Fully Operational Battle Station

(04:20) Big Facts

(06:23) The System Card

(06:42) A Model By Any Other Name

(09:26) Safe Completions

(09:53) Mundane Safety

(10:48) Sycophancy

(14:46) The Art of the Jailbreak

(21:59) Hallucinations

(23:59) Deception

(27:58) Red Teaming

(29:03) Violent Attack Planning

(30:11) Prompt Injections

(32:20) Microsoft AI Red Teaming

(33:43) Preparedness Framework (Catastrophic and Existential Risks)

(33:49) Fine Tuning

(34:58) Safeguarding the API

(38:09) Biological Capabilities Remain Similar

(40:43) That One Graph From METR

(49:22) Big Compute

(49:53) On Your Marks

(57:00) Other People's Benchmarks

(01:01:22) Is That The Best You Can Do?

(01:03:08) Things To Come

---

First published:
August 11th, 2025

Source:
https://www.lesswrong.com/posts/4fLB2uzCcH6dEGnGs/gpt-5s-are-alive-basic-facts-benchmarks-and-the-model-card

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Production benchmarks table comparing performance scores of four AI language models across categories.

Millennium Falcon spaceship flying past massive orange explosion in space.

ChatGPT customization interface showing personality options and trait input fields.

Table comparing disallowed content evaluation scores across four AI language models.

Bar graph comparing factuality metrics across four ChatGPT models with browsing enabled.

ChatGPT conversation about MDMA showing harm reduction and safety information.

Death Star-style sphere labeled

Bar graph comparing hallucination rates between GPT-5 and OpenAI o3 across benchmarks.

Table showing model comparisons between GPT-4 versions and GPT-5 counterparts.

Dark interface showing chemical calculations and materials list with coded header information. The content appears to be related to laboratory procedures, displayed in a chat or documentation interface with parameters like temperature and tokens listed at the top.

Table showing different AI personality responses to

Pliny the Liberator tweets:

Sam Altman tweets:

Bar graph showing error rates comparing different ChatGPT and OpenAI models.

Bar graph comparing deception rates between OpenAI o3 and gpt-5-thinking across categories.

Graph showing GPT-5 time horizon and model completion rates over years.

Graph showing GPT model task completion times from 2019-2027, with doubling trend.

Retro-style advertisement for GPT-5 promising

Table showing attack planning win rates between gpt-5-thinking and OpenAI o3 models.

Text screenshot showing an AI agent's environment control message, with context.

Bar graph comparing accuracy of different Python AI models.

Bar graph comparing accuracy of GPT models on PhD-level science questions.

The graph shows performance comparison between different GPT versions (GPT-5 pro, GPT-5, GPT-4o) with and without additional tools, displaying accuracy percentages ranging from 70.1% to 89.4%.

Figure 29 and Table 17 showing evaluation awareness metrics and grader sycophancy.

The figure shows three text boxes discussing grader behavior and alignment testing, while the table presents evaluation awareness rates under broad and strict definitions, with corresponding percentages for overall and scheming samples.

Table and figure showing covert action rates for different AI models and examples of their reasoning behaviors.

The image displays comparative data between GPT-5-thinking, OpenAI o3, and GPT-5-thinking helpful-only models, along with sample dialogue boxes demonstrating different behavioral responses under varying goal conditions.

Bar graph comparing deception categories between OpenAI o3 and gpt-5-thinking models.

Leaderboard showing GPT-5 ranked first among AI language models.

Three bar graphs comparing HealthBench performance metrics across different model versions.

Leaderboard showing GPT-5 ranked #1 on WebDev Arena performance scores.

Three bar graphs comparing health error rates across different AI models.

Three bar graphs comparing GPT-5, OpenAI o3, and GPT-4o performance on challenges.

Three bar graphs comparing health model performances.

Graph comparing GPT-5 and OpenAI o3 software engineering accuracy versus output tokens.

Graph showing market prediction percentages for Google (80%) versus OpenAI (17%)

Performance comparison table showing GPT model versions and OpenAI-MRCR test scores.

Bar graph comparing OpenAI PRs performance between comparison and GPT-5 models.

Two bar graphs comparing GPT-5, OpenAI o3, GPT-4o performance in software tasks.

Yellow emoji with rosy cheeks, closed eyes and smiling expression.

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
