arXiv 2602.17835

Influence-Preserving Proxies for Gradient-Based Data Selection in LLM Fine-tuning

By Sirui Chen, Yunzhe Qi, et al.

Published 2026-02-19

Supervised fine-tuning (SFT) relies critically on selecting training data that most benefits a model's downstream performance. Gradient-based data selection methods such as TracIn and Influence Functions use influence estimates to identify useful samples, but their computational cost scales poorly with model size, making them impractical for multi-billion-parameter large language models (LLMs). A common alternative is to use off-the-she…
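To make the cost of gradient-based selection concrete, here is a minimal sketch of TracIn-style scoring on a toy logistic-regression model. It is an illustration only, not the paper's proxy method, which the truncated abstract above does not describe. The helper names (`logistic_grad`, `tracin_scores`, `select_top_k`) and the toy data are assumptions made for this example. Each training example is scored by the learning-rate-weighted dot product between its loss gradient and the mean validation gradient, summed over saved checkpoints.

```python
import numpy as np

def logistic_grad(w, x, y):
    """Gradient of the logistic loss for one example (x, y), with y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-(x @ w)))
    return (p - y) * x

def tracin_scores(checkpoints, lrs, train_X, train_y, val_X, val_y):
    """TracIn-style score per training example (hypothetical helper).

    Sums, over checkpoints, lr * <grad(train example), mean grad(validation set)>.
    A positive score suggests a step on that example would reduce validation loss.
    """
    scores = np.zeros(len(train_X))
    for w, lr in zip(checkpoints, lrs):
        val_grad = np.mean(
            [logistic_grad(w, x, y) for x, y in zip(val_X, val_y)], axis=0
        )
        train_grads = np.stack(
            [logistic_grad(w, x, y) for x, y in zip(train_X, train_y)]
        )
        scores += lr * (train_grads @ val_grad)
    return scores

def select_top_k(scores, k):
    """Indices of the k highest-scoring training examples."""
    return np.argsort(-scores)[:k]

# Toy data (assumed): one checkpoint at w = 0, learning rate 1.0.
checkpoints = [np.zeros(2)]
lrs = [1.0]
train_X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
train_y = np.array([1.0, 0.0, 1.0])
val_X = np.array([[1.0, 0.0]])
val_y = np.array([1.0])

scores = tracin_scores(checkpoints, lrs, train_X, train_y, val_X, val_y)
top = select_top_k(scores, 2)
```

Even in this toy form, every training example needs one full gradient per checkpoint, and each gradient has as many entries as the model has parameters. That per-example, per-checkpoint cost is what becomes prohibitive at multi-billion-parameter scale.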
