arXiv 2305.16934
On Evaluating Adversarial Robustness of Large Vision-Language Models
By Yunqing Zhao, Tianyu Pang, et al.
Published 2023-05-26
Large vision-language models (VLMs) such as GPT-4 have achieved unprecedented performance in response generation, especially with visual inputs, enabling more creative and adaptable interaction than large language models such as ChatGPT. Nonetheless, multimodal generation exacerbates safety concerns, since adversaries may successfully evade the entire system by subtly manipulating the most vulnerable modality (e.g.,…