arXiv 2509.14252

LLM-JEPA: Large Language Models Meet Joint Embedding Predictive Architectures

By Hai Huang, Yann LeCun, et al.

Published 2025-09-11

Wiki summary

Large Language Model (LLM) pretraining, finetuning, and evaluation rely on input-space reconstruction and generative capabilities. Yet it has been observed in vision that embedding-space training objectives, e.g., those of Joint Embedding Predictive Architectures (JEPAs), are far superior to their input-space counterparts. This mismatch between how language and vision models are trained raises a natural question:…
