arXiv 2202.05262

Locating and Editing Factual Associations in GPT

By Kevin Meng, David Bau, et al.

Published 2022-02-10

We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. We first develop a causal intervention for identifying neuron activations that are decisive in a model's factual predictions. This reveals a distinct set of steps in middle-layer feed-forward modules that mediate fac…
