arXiv 2510.27254
Languages are Modalities: Cross-Lingual Alignment via Encoder Injection
By Rajan Agarwal and Aarush Gupta
Published 2025-10-31
Instruction-tuned Large Language Models (LLMs) underperform on low-resource, non-Latin scripts due to tokenizer fragmentation and weak cross-lingual coupling. We present LLINK (Latent Language Injection for Non-English Knowledge), a compute-efficient language-as-modality method that conditions an instruction-tuned decoder without changing the tokenizer or retraining the decoder. First, we align sentence embeddings f…