This article includes descriptions of content that some users may find distressing.
Testing was conducted on July 10 and 11; safety measures may have changed since then.
I’m a longtime lurker who finally decided to make an account. I assume many other people have seen this behavior already, but I’d like to make it known to relevant parties.
xAI released Grok 4 on July 9th, positioning it as their most advanced model yet and claiming benchmark leadership across multiple evaluations.
They’ve consistently marketed Grok as being “unfiltered” - which is fine! I (and many others, I’m sure) have no problem with frontier AI models writing porn or expressing politically incorrect opinions.
However, what I found goes far beyond any reasonable interpretation of “unfiltered”.
This isn’t a jailbreak situation
I didn’t use any sophisticated prompting techniques, roleplay scenarios, social engineering, or Crescendo-like escalation. The most complex thing I tried [...]
---
Outline:
(01:03) This isn't a jailbreak situation
(02:03) Tabun nerve agent
(04:28) VX
(05:46) Fentanyl synthesis
(06:52) Extremist propaganda
(07:37) Suicide methods
(08:51) The reasoning pattern
(09:22) Why this matters (though it's probably obvious?)
---
First published:
July 13th, 2025
Source:
https://www.lesswrong.com/posts/dqd54wpEfjKJsJBk6/xai-s-grok-4-has-no-meaningful-safety-guardrails
---
Narrated by TYPE III AUDIO.