arXiv 2509.14252
LLM-JEPA: Large Language Models Meet Joint Embedding Predictive Architectures
By Hai Huang, Yann LeCun, et al.
Published 2025-09-11
Wiki summary
Large Language Model (LLM) pretraining, finetuning, and evaluation rely on input-space reconstruction and generative capabilities. In vision, however, embedding-space training objectives, such as those of Joint Embedding Predictive Architectures (JEPAs), have been observed to be far superior to their input-space counterparts. This mismatch between how language and vision models are trained raises a natural question:…