arXiv 2202.05262
Locating and Editing Factual Associations in GPT
By Kevin Meng, David Bau, et al.
Published 2022-02-10
We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. We first develop a causal intervention for identifying neuron activations that are decisive in a model's factual predictions. This reveals a distinct set of steps in middle-layer feed-forward modules that mediate fac…
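The causal intervention mentioned in the abstract can be illustrated with a small activation-patching sketch. This is not the authors' implementation and does not use GPT: it assumes a hypothetical two-layer toy network with made-up weights (`W1`, `W2`) and made-up clean and corrupted inputs. It runs the network once on each input, then copies one clean hidden activation at a time into the corrupted run and records how much of the target probability is recovered.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Hypothetical toy weights (assumptions, not from the paper):
# 2 inputs -> 3 hidden units -> 2 output classes.
W1 = [[1.5, -0.5], [0.3, 0.8], [-1.0, 1.0]]
W2 = [[2.0, 0.5, 0.0], [-2.0, 0.2, 0.0]]  # hidden unit 2 has no outgoing weight

def hidden(x):
    # Hidden activations of the toy network.
    return [math.tanh(v) for v in matvec(W1, x)]

def target_prob(h, target):
    # Probability the toy network assigns to the target class given hidden state h.
    return softmax(matvec(W2, h))[target]

def trace_units(x_clean, x_corrupt, target):
    """Patch each clean hidden activation into the corrupted run, one at a time."""
    h_clean = hidden(x_clean)
    h_corrupt = hidden(x_corrupt)
    p_clean = target_prob(h_clean, target)
    p_corrupt = target_prob(h_corrupt, target)
    effects = []
    for i in range(len(h_clean)):
        h_patched = list(h_corrupt)
        h_patched[i] = h_clean[i]  # restore a single clean activation
        effects.append(target_prob(h_patched, target) - p_corrupt)
    return p_clean, p_corrupt, effects

# Made-up clean and corrupted inputs for illustration only.
p_clean, p_corrupt, effects = trace_units([1.0, 0.2], [-0.4, 0.9], target=0)
for i, e in enumerate(effects):
    print(f"hidden unit {i}: recovered probability {e:+.3f}")
```

In this toy setup the activation with the largest recovered probability is the one the prediction depends on most, while a unit with no outgoing weight recovers nothing. A real model would need the same comparison repeated across layers and token positions.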