arXiv 2311.03099

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

By Le Yu, Bowen Yu, et al.

Published 2023-11-06

In this paper, we unveil that Language Models (LMs) can acquire new capabilities by assimilating parameters from homologous models without retraining or GPUs. We first introduce DARE, which sets most delta parameters (i.e., the differences between fine-tuned and pre-trained parameters) to zero without affecting the abilities of Supervised Fine-Tuning (SFT) LMs. DARE randomly Drops delta parameters with a ratio And REscal…
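The drop step described above can be illustrated with a minimal sketch on flat lists of floats. The function name `dare`, its parameters, and the seeded generator are hypothetical and chosen only for illustration. The abstract is cut off before it states how the rescale step works, so the factor 1 / (1 - drop_ratio) used here is an assumption, picked because it keeps the expected value of each delta unchanged.

```python
import random
from typing import List, Optional


def dare(
    pretrained: List[float],
    finetuned: List[float],
    drop_ratio: float,
    seed: Optional[int] = None,
) -> List[float]:
    """Drop-and-rescale sketch on flat parameter lists (hypothetical helper)."""
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    if len(pretrained) != len(finetuned):
        raise ValueError("parameter lists must have the same length")

    rng = random.Random(seed)
    # ASSUMPTION: surviving deltas are scaled by 1 / (1 - drop_ratio).
    keep_scale = 1.0 / (1.0 - drop_ratio)

    result = []
    for base, tuned in zip(pretrained, finetuned):
        delta = tuned - base  # delta parameter: fine-tuned minus pre-trained
        if rng.random() < drop_ratio:
            delta = 0.0  # dropped with probability drop_ratio
        else:
            delta *= keep_scale  # rescaled if kept
        result.append(base + delta)
    return result
```

Under that assumption, with `drop_ratio=0.5` each surviving delta is doubled, so every output entry equals either the pre-trained value or the pre-trained value plus twice the delta, and its expected value matches the fine-tuned parameter.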