arXiv 2511.00903

ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval

By Ahmed Masry, Megh Thakkar, et al.

Published 2025-11-02

Wiki summary


Retrieval-augmented generation has proven practical when models require specialized knowledge or access to the latest data. However, existing methods for multimodal document retrieval often replicate techniques developed for text-only retrieval, whether in how they encode documents, define training objectives, or compute similarity scores. To address these limitations, we present ColMate, a document retrieval model…
