arXiv 2511.00903

ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval

By Ahmed Masry, Megh Thakkar, et al.

Published 2025-11-02

Retrieval-augmented generation has proven practical when models require specialized knowledge or access to the latest data. However, existing methods for multimodal document retrieval often replicate techniques developed for text-only retrieval, whether in how they encode documents, define training objectives, or compute similarity scores. To address these limitations, we present ColMate, a document retrieval model…
