arXiv 2106.09685

LoRA: Low-Rank Adaptation of Large Language Models

By Edward J. Hu, Yelong Shen, et al.

Published 2021-06-17

An important paradigm of natural language processing consists of large-scale pre-training on general-domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example, deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We prop…
