arXiv 2602.03992

Nemotron ColEmbed V2: Top-Performing Late Interaction Embedding Models for Visual Document Retrieval

By Gabriel de Souza P. Moreira, Ronay Ak, et al.

Published 2026-02-03

Wiki summary


Retrieval-Augmented Generation (RAG) systems are widely used in generative applications, where they augment language models by injecting external knowledge. Companies are trying to leverage their large catalogs of documents (e.g., PDFs, presentation slides) in such RAG pipelines, whose first step is the retrieval component. Dense retrieval is a popular approach, in which embedding models are used to generate a dense…
