arXiv:2604.07729
Emotion Concepts and their Function in a Large Language Model
By Nicholas Sofroniew, Isaac Kauvar, et al.
Published 2026-04-09
Large language models (LLMs) sometimes appear to exhibit emotional reactions. We investigate why this is the case in Claude Sonnet 4.5 and explore implications for alignment-relevant behavior. We find internal representations of emotion concepts, which encode the broad concept of a particular emotion and generalize across contexts and behaviors it might be linked to. These representations track the operative emotion…