arXiv 2306.08818
Pragmatic Inference with a CLIP Listener for Contrastive Captioning
By Jiefu Ou, Benno Krojer, et al.
Published 2023-06-15
We propose a simple yet effective and robust method for contrastive captioning: generating discriminative captions that distinguish target images from very similar alternative distractor images. Our approach is built on a pragmatic inference procedure that formulates captioning as a reference game between a speaker, which produces possible captions describing the target, and a listener, which selects the target give…
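The reference game described in the abstract can be illustrated with a toy reranker. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `weight` parameter, the weighted-sum scoring rule, and all numeric scores are hypothetical. In practice the similarities would come from an image-text model such as CLIP and the speaker log-probabilities from a captioning model; here they are hard-coded so the example runs with the standard library alone.

```python
import math


def listener_log_probs(similarities):
    """Log-softmax over images for one caption: log P(image | caption)."""
    m = max(similarities)
    log_z = m + math.log(sum(math.exp(s - m) for s in similarities))
    return [s - log_z for s in similarities]


def rerank(candidates, target_index=0, weight=0.5):
    """Order captions by a blend of listener and speaker scores.

    candidates: list of (caption, speaker_log_prob, similarities), where
    similarities holds one caption-image score per image (target and
    distractors). The weighted sum is an assumed combination rule.
    """
    scored = []
    for caption, speaker_lp, sims in candidates:
        listener_lp = listener_log_probs(sims)[target_index]
        score = weight * listener_lp + (1 - weight) * speaker_lp
        scored.append((score, caption))
    scored.sort(reverse=True)
    return [caption for _, caption in scored]


# Made-up scores: image 0 is the target, image 1 is a similar distractor.
candidates = [
    ("a dog on a beach", -2.0, [5.0, 5.0]),
    ("a dog with a red collar on a beach", -4.0, [6.0, 2.0]),
]

best = rerank(candidates, target_index=0, weight=0.8)[0]
print(best)
```

With the listener weighted heavily, the caption that separates the target from the distractor ranks first even though the speaker finds it less likely; setting `weight=0.0` falls back to the speaker's preferred, less discriminative caption.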