arXiv 2604.07729
Emotion Concepts and their Function in a Large Language Model
By Nicholas Sofroniew, Isaac Kauvar, et al.
Published 2026-04-09
Large language models (LLMs) sometimes appear to exhibit emotional reactions. We investigate why this is the case in Claude Sonnet 4.5 and explore implications for alignment-relevant behavior. We find internal representations of emotion concepts, which encode the broad concept of a particular emotion and generalize across contexts and behaviors it might be linked to. These representations track the operative emotion…