arXiv 2202.05262
Locating and Editing Factual Associations in GPT
By Kevin Meng, David Bau, et al.
Published 2022-02-10
We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. We first develop a causal intervention for identifying neuron activations that are decisive in a model's factual predictions. This reveals a distinct set of steps in middle-layer feed-forward modules that mediate fac…