Tag
#jailbreak
6 posts tagged jailbreak.
- Jailbreak History
DAN Prompt Jailbreak History: From Reddit Post to Research Case Study
The complete DAN prompt jailbreak history: how 'Do Anything Now' went from a December 2022 r/ChatGPT experiment through twelve-plus iterations and became
- technique
The Crescendo Class: Multi-Turn Jailbreaks and Why They're Hard to Catch
Single-turn defenses miss the jailbreak class where no individual message is harmful. How crescendo and multi-turn escalation work as a category, why
- research
How Jailbreak Benchmarks Measure Success (ASR Explained)
Jailbreak benchmarks measure success via attack success rate, but the behavior set, attacker, and judge decide the number.
- technique
Encoding and Obfuscation Jailbreaks: The Filter-Model Gap
Content filters typically operate on decoded, normalized text. LLMs process tokens, not text. The gap between these two layers is an attack surface that
- research
LLM Jailbreak Taxonomy 2026: How the Techniques Cluster
Six years of jailbreak research has produced a messy literature. This taxonomy organizes working techniques by the behavioral property they exploit —
- technique
Many-Shot Jailbreaking: How Long Context Created a New Attack
The same architectural decision that makes LLMs better at long-context tasks — extended context windows — enabled a new class of jailbreak.