arXiv 2406.09519
Talking Heads: Understanding Inter-layer Communication in Transformer Language Models
By Jack Merullo, Carsten Eickhoff, et al.
Published 2024-06-13
Although it is known that transformer language models (LMs) pass features from early layers to later layers, it is not well understood how this information is represented and routed by the model. We analyze a mechanism used in two LMs to selectively inhibit items in a context in one task, and find that it underlies a commonly used abstraction across many context-retrieval behaviors. Specifically, we find that models…