arXiv 2504.13169

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

By Tsung-Han Wu, Heekyung Lee, et al.

Published 2025-04-17

Wiki summary

Vision-Language Models (VLMs) excel at visual understanding but often suffer from visual hallucinations, where they generate descriptions of nonexistent objects, actions, or concepts, posing significant risks in safety-critical applications. Existing hallucination mitigation methods typically follow one of two paradigms: generation adjustment, which modifies decoding behavior to align text with visual inputs, and po…
