arXiv 2504.01818

Efficient Constant-Space Multi-Vector Retrieval

By Sean MacAvaney, Antonio Mallia, et al.

Published 2025-04-02

Multi-vector retrieval methods, exemplified by the ColBERT architecture, have shown substantial promise by offering strong trade-offs between retrieval latency and effectiveness. However, they incur a high storage cost, since a (potentially compressed) vector must be stored for every token in the input collection. To overcome this issue, we propose encoding documents to a fixed nu…
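To make the storage concern concrete, the following is a minimal sketch of ColBERT-style late-interaction scoring (MaxSim) together with a rough count of stored values. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy vectors, and the `vectors_per_doc` parameter are all hypothetical, and the constant per-document budget is assumed only from the paper's title.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late-interaction score: for each query vector, take the best-matching
    document vector by dot product, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def per_token_floats(doc_lengths: List[int], dim: int) -> int:
    """Stored values when every document token keeps its own vector.
    Grows with the total number of tokens in the collection."""
    return sum(doc_lengths) * dim


def fixed_budget_floats(num_docs: int, vectors_per_doc: int, dim: int) -> int:
    """Stored values under an assumed constant per-document budget
    (hypothetical parameter). Independent of document length."""
    return num_docs * vectors_per_doc * dim


# Toy data, chosen only for illustration.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

score = maxsim_score(query, doc)
per_token = per_token_floats([120, 480], dim=128)
fixed = fixed_budget_floats(num_docs=2, vectors_per_doc=32, dim=128)
```

In this toy setup the per-token count scales with the 600 total tokens, while the assumed fixed budget depends only on the number of documents. The scoring function is unchanged in both cases, since it only needs a list of document vectors.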
