arXiv 2510.15068
Sequential Comics for Jailbreaking Multimodal Large Language Models via Structured Visual Storytelling
By Deyue Zhang, Dongdong Yang, et al.
Published 2025-10-16
Multimodal large language models (MLLMs) exhibit remarkable capabilities but remain susceptible to jailbreak attacks exploiting cross-modal vulnerabilities. In this work, we introduce a novel method that leverages sequential comic-style visual narratives to circumvent safety alignments in state-of-the-art MLLMs. Our method decomposes malicious queries into visually innocuous storytelling elements using an auxiliary…