LessWrong (30+ Karma)

“AI #118: Claude Ascendant” by Zvi

112 min • 29 May 2025

The big news of this week was of course the release of Claude 4 Opus. I offered two review posts, one on safety and alignment and one on mundane utility, plus a bonus fun post on Google's Veo 3.

I am once again defaulting to Claude for most of my LLM needs, although I will often also check o3 and perhaps Gemini 2.5 Pro.

On the safety and alignment front, Anthropic did extensive testing, and reported that testing in an exhaustive model card. A lot of people got very upset to learn that Opus could, if pushed too hard in the wrong situations engineered for these results, do things like report your highly unethical actions to authorities or try to blackmail developers into not being shut down or replaced. It is good that we now know about these things, and it was quickly observed that similar behaviors [...]

---

Outline:

(01:23) Language Models Offer Mundane Utility

(08:54) Now With Extra Glaze

(15:54) Get My Agent On The Line

(17:03) Language Models Don't Offer Mundane Utility

(22:49) Huh, Upgrades

(26:42) On Your Marks

(27:35) Choose Your Fighter

(33:40) Deepfaketown and Botpocalypse Soon

(37:51) Fun With Media Generation

(38:21) Playing The Training Data Game

(38:38) They Took Our Jobs

(46:51) The Art of Learning

(49:10) The Art of the Jailbreak

(49:49) Unprompted Attention

(50:44) Get Involved

(51:33) Introducing

(51:52) In Other AI News

(52:45) Show Me the Money

(57:08) Nvidia Sells Out

(01:03:14) Quiet Speculations

(01:06:16) The Quest for Sane Regulations

(01:18:18) The Week in Audio

(01:20:13) Rhetorical Innovation

(01:34:29) Board of Anthropic

(01:37:08) Misaligned!

(01:39:22) Aligning a Smarter Than Human Intelligence is Difficult

(01:40:21) Americans Do Not Like AI

(01:42:37) People Are Worried About AI Killing Everyone

(01:44:09) Other People Are Not As Worried About AI Killing Everyone

(01:46:01) The Lighter Side

---

First published:
May 29th, 2025

Source:
https://www.lesswrong.com/posts/9THq9RvpbmecWa6Ni/ai-118-claude-ascendant

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Freedom scores table showing UAE's low ratings for world and internet.
Bar graph comparing AI regulation concerns between U.S. adults and AI experts.
A doctor in white coat next to two facial expressions, smiling and serious.
Circular diagram showing NSF grant funding distribution across academic disciplines in 2025.
Bar graph comparing AI attitudes between U.S. adults and AI experts.
FRED graph showing employment trends for full-time bank tellers since 2002.
MacOS settings window showing binary code and
Two graphs showing AI manipulation capability versus population impact and growth over time.
A bearded man in a black hat pointing dramatically, with
Text message conversation discussing ChatGPT and
Graph showing ChatGPT's daily usage minutes from May 2023 to April 2025.

The chart shows a significant upward trend from about 5 minutes to 18 minutes per day, with a 194% increase noted.
Terminal window showing error messages and
Graph comparing AI experts' and public's outlook on AI's future impact. Title:
Text announcement about OpenAI launching Stargate UAE and international AI infrastructure partnership.
Bar graph comparing
Text excerpt discussing risks of applying Netflix business strategies to AI systems, highlighting five main concerns:
-
xlr8harder tweets:
Jonas Vollmer tweets:
xlr8harder tweets:
This appears to be a conversation between a user and GPT-4, discussing a profound interaction about AI alignment and the concept of treating AI as
Graph showing gender differences in AI experts' views on AI impact. Three key metrics compared between men and women experts.
Humorous photoshopped gorillas playing chess, wearing monocle and binoculars.
Graph showing confidence levels in AI regulation by government and industry sectors. From Pew Research Center.
Black t-shirt with text about preferring to talk to Claude instead of AI
Bar graph comparing AI experts' confidence levels between academic and industry sectors. Shows data about responsible AI development and government regulation.
Black t-shirt with white text
The image shows a gorilla contemplating a chessboard, presented in both photographic and artistic renditions. In both versions, the gorilla is holding a wooden chess board with pieces arranged on it while sitting in a lush, green jungle environment. The left image appears to be a photograph, while the right is an artistic illustration with similar composition and lighting.

The prompt at the top requests
Yellow warning triangle with black exclamation mark.

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
