arXiv 2410.09982
Self-Data Distillation for Recovering Quality in Pruned Large Language Models
By Vithursan Thangarasa, Ganesh Venkatesh, et al.
Published 2024-10-13
Large language models have driven significant progress in natural language processing, but their deployment requires substantial compute and memory resources. As models scale, compression techniques become essential for balancing model quality with computational efficiency. Structured pruning, which removes less critical components of the model, is a promising strategy for reducing complexity. However, one-shot prun…