arXiv:2202.05262

Locating and Editing Factual Associations in GPT

By Kevin Meng, David Bau, et al.

Published 2022-02-10

We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. We first develop a causal intervention for identifying neuron activations that are decisive in a model's factual predictions. This reveals a distinct set of steps in middle-layer feed-forward modules that mediate fac…
