arXiv 2311.03099
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
By Le Yu, Bowen Yu, et al.
Published 2023-11-06
In this paper, we unveil that Language Models (LMs) can acquire new capabilities by assimilating parameters from homologous models without retraining or GPUs. We first introduce DARE to set most delta parameters (i.e., the disparity between fine-tuned and pre-trained parameters) to zeros without affecting the abilities of Supervised Fine-Tuning (SFT) LMs, which randomly Drops delta parameters with a ratio And REscal…
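The drop-and-rescale idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract is cut off before it states how the rescaling is done, so the factor 1 / (1 - drop_rate) used here is an assumption borrowed from dropout-style rescaling, and the function name `dare_sketch` and its parameters are hypothetical.

```python
import numpy as np


def dare_sketch(pretrained, finetuned, drop_rate, rng):
    """Randomly zero out delta parameters, then rescale the survivors.

    Hypothetical helper for illustration only. The rescale factor
    1 / (1 - drop_rate) is an assumption, not taken from the text above.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")

    # Delta parameters: the difference between fine-tuned and pre-trained weights.
    delta = finetuned - pretrained

    # Keep each delta entry independently with probability 1 - drop_rate.
    keep_mask = rng.random(delta.shape) >= drop_rate

    # Zero the dropped entries and scale up the kept ones (assumed factor),
    # so the expected value of each delta entry is unchanged.
    sparse_delta = np.where(keep_mask, delta / (1.0 - drop_rate), 0.0)

    # Add the sparsified delta back onto the pre-trained weights.
    return pretrained + sparse_delta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pretrained = np.zeros(10_000)
    finetuned = np.full(10_000, 0.01)
    merged = dare_sketch(pretrained, finetuned, drop_rate=0.9, rng=rng)
    print("fraction of delta entries set to zero:", np.mean(merged == 0.0))
    print("mean delta before:", (finetuned - pretrained).mean())
    print("mean delta after: ", (merged - pretrained).mean())
```

Under the assumed rescaling, the mean of the delta stays roughly the same even though most entries are zeroed, which is the intuition behind dropping most delta parameters without changing the model's behavior much.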