arXiv 2501.02086

Instruction-Following Pruning for Large Language Models

By Bairu Hou, Qibin Chen, et al.

Published 2025-01-03

With the rapid scaling of large language models (LLMs), structured pruning has become a widely used technique to learn efficient, smaller models from larger ones, delivering superior performance compared to training similarly sized models from scratch. In this paper, we move beyond the traditional static pruning approach of determining a fixed pruning mask for a model, and propose a dynamic approach to structured pr…
