arXiv 2305.16934
On Evaluating Adversarial Robustness of Large Vision-Language Models
By Yunqing Zhao, Tianyu Pang, et al.
Published 2023-05-26
Large vision-language models (VLMs) such as GPT-4 have achieved unprecedented performance in response generation, especially with visual inputs, enabling more creative and adaptable interaction than large language models such as ChatGPT. Nonetheless, multimodal generation exacerbates safety concerns, since adversaries may successfully evade the entire system by subtly manipulating the most vulnerable modality (e.g.,…