arXiv 2511.00903
ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval
By Ahmed Masry, Megh Thakkar, et al.
Published 2025-11-02
Retrieval-augmented generation has proven practical when models require specialized knowledge or access to the latest data. However, existing methods for multimodal document retrieval often replicate techniques developed for text-only retrieval, whether in how they encode documents, define training objectives, or compute similarity scores. To address these limitations, we present ColMate, a document retrieval model…