arXiv 2501.02086

Instruction-Following Pruning for Large Language Models

By Bairu Hou, Qibin Chen, et al.

Published 2025-01-03

With the rapid scaling of large language models (LLMs), structured pruning has become a widely used technique to learn efficient, smaller models from larger ones, delivering superior performance compared to training similarly sized models from scratch. In this paper, we move beyond the traditional static pruning approach of determining a fixed pruning mask for a model, and propose a dynamic approach to structured pr…
