arXiv 2511.15304
Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models
By Piercosma Bisconti, Matteo Prandi, et al.
Published 2025-11-19
We present evidence that adversarial poetry functions as a universal single-turn jailbreak technique for Large Language Models (LLMs). Across 25 frontier proprietary and open-weight models, curated poetic prompts yielded high attack-success rates (ASR), with some providers exceeding 90%. Mapping prompts to MLCommons and EU CoP risk taxonomies shows that poetic attacks transfer across CBRN, manipulation, cyber-offenc…