arXiv 2505.15045

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

By Siyue Zhang, Yilun Zhao, et al.

Published 2025-05-21

Large language model (LLM)-based embedding models, benefiting from large-scale pre-training and post-training, have begun to surpass BERT- and T5-based models on general-purpose text embedding tasks such as document retrieval. However, a fundamental limitation of LLM embeddings lies in the unidirectional attention used during autoregressive pre-training, which misaligns with the bidirectional nature of text embedding…
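As a rough illustration of the contrast the abstract draws, the sketch below builds the two attention patterns as 0/1 matrices: a causal (unidirectional) mask, where each token sees only itself and earlier tokens, and a bidirectional mask, where every token sees the whole sequence. This is not code from the paper; the helper name `attention_mask` and the sequence length of 4 are assumptions chosen purely for illustration.

```python
def attention_mask(seq_len, causal):
    """Return a seq_len x seq_len 0/1 matrix (hypothetical helper).

    Entry [i][j] is 1 if token i may attend to token j, else 0.
    """
    return [
        [1 if (not causal or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Unidirectional: lower-triangular, as in autoregressive pre-training.
causal = attention_mask(4, causal=True)

# Bidirectional: every position can attend to every other position.
bidirectional = attention_mask(4, causal=False)

for name, mask in (("causal", causal), ("bidirectional", bidirectional)):
    print(name)
    for row in mask:
        print(" ".join(str(v) for v in row))
```

Under the causal mask, the first token's representation cannot depend on anything that follows it, whereas under the bidirectional mask every token's representation can draw on the full text.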
