arXiv 2601.10387
The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
By Christina Lu, Jack Gallagher, et al.
Published 2026-01-15
Large language models can represent a variety of personas but typically default to a helpful Assistant identity cultivated during post-training. We investigate the structure of the space of model personas by extracting activation directions corresponding to diverse character archetypes. Across several different models, we find that the leading component of this persona space is an "Assistant Axis," which captures th…
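As a rough illustration of what a "leading component" of a persona space could mean, the sketch below runs a principal component analysis over a set of per-persona activation vectors. Everything here is an assumption for illustration: the data is synthetic, `leading_axis` and `persona_vectors` are hypothetical names, and reading "leading component" as the first principal component is our interpretation of the abstract, not a confirmed description of the paper's procedure.

```python
import numpy as np

def leading_axis(persona_vectors):
    """Return the unit-norm first principal component of the rows of
    persona_vectors, plus the fraction of variance it explains.
    (Hypothetical helper, not taken from the paper.)"""
    X = np.asarray(persona_vectors, dtype=float)
    # Center so the component describes variation between personas,
    # not their shared mean activation.
    centered = X - X.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return vt[0], float(explained[0])

# Synthetic stand-in for real activations: 40 "personas" in a 64-dim space,
# varying mostly along one hidden direction plus isotropic noise.
rng = np.random.default_rng(0)
d_model, n_personas = 64, 40
true_axis = rng.normal(size=d_model)
true_axis /= np.linalg.norm(true_axis)
coords = rng.normal(scale=5.0, size=(n_personas, 1))
noise = rng.normal(scale=0.5, size=(n_personas, d_model))
persona_vectors = coords * true_axis + noise

axis, ratio = leading_axis(persona_vectors)
# The sign of a principal component is arbitrary, so compare by |cosine|.
alignment = abs(float(axis @ true_axis))
print(f"explained variance ratio: {ratio:.3f}, alignment: {alignment:.3f}")
```

With real models, the rows would instead be activation directions extracted for each character archetype; how those directions are computed is not specified in the text available here.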