arXiv 2106.09685
LoRA: Low-Rank Adaptation of Large Language Models
By Edward J. Hu, Yelong Shen, et al.
Published 2021-06-17
An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Taking GPT-3 175B as an example, deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We prop…