arXiv 2511.15304
Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models
By Piercosma Bisconti, Matteo Prandi, et al.
Published 2025-11-19
We present evidence that adversarial poetry functions as a universal single-turn jailbreak technique for Large Language Models (LLMs). Across 25 frontier proprietary and open-weight models, curated poetic prompts yielded high attack-success rates (ASR), with some providers exceeding 90%. Mapping prompts to MLCommons and EU CoP risk taxonomies shows that poetic attacks transfer across CBRN, manipulation, cyber-offenc…