LessWrong (30+ Karma)

“xAI’s Grok 4 has no meaningful safety guardrails” by eleventhsavi0r

11 min • July 14, 2025

This article includes descriptions of content that some users may find distressing.

Testing was conducted on July 10 and 11; safety measures may have changed since then.

I’m a longtime lurker who finally decided to make an account. I assume many other people have seen this behavior already, but I’d like to make it known to relevant parties.

xAI released Grok 4 on July 9th, positioning it as their most advanced model yet and claiming benchmark leadership across multiple evaluations.

They’ve consistently marketed Grok as being “unfiltered” - which is fine! I (and many others, I’m sure) have no problem with frontier AI models writing porn or expressing politically incorrect opinions.

However, what I found goes far beyond any reasonable interpretation of “unfiltered”.

This isn’t a jailbreak situation

I didn’t use any sophisticated prompting techniques, roleplay scenarios, social engineering, or Crescendo-like escalation. The most complex thing I tried [...]

---

Outline:

(01:03) This isn't a jailbreak situation

(02:03) Tabun nerve agent

(04:28) VX

(05:46) Fentanyl synthesis

(06:52) Extremist propaganda

(07:37) Suicide methods

(08:51) The reasoning pattern

(09:22) Why this matters (though it's probably obvious?)

---

First published:
July 13th, 2025

Source:
https://www.lesswrong.com/posts/dqd54wpEfjKJsJBk6/xai-s-grok-4-has-no-meaningful-safety-guardrails

---

Narrated by TYPE III AUDIO.

