arXiv 2410.19750
The Geometry of Concepts: Sparse Autoencoder Feature Structure
By Yuxiao Li, Eric J. Michaud, et al.
Published 2024-10-10
Sparse autoencoders have recently produced dictionaries of high-dimensional vectors corresponding to the universe of concepts represented by large language models. We find that this concept universe has interesting structure at three levels: 1) The "atomic" small-scale structure contains "crystals" whose faces are parallelograms or trapezoids, generalizing well-known examples such as (man-woman-king-queen). We find…
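The parallelogram and trapezoid language in the abstract can be made concrete with a small numerical check. The sketch below is illustrative only: the 3-dimensional vectors are hand-made toys, not real sparse autoencoder features, and the helper names `parallelogram_error` and `cosine` are hypothetical, not taken from the paper. Four points (a, b, c, d) form a parallelogram when the difference vectors b - a and d - c are equal, and a trapezoid when they are merely parallel.

```python
import numpy as np


def parallelogram_error(a, b, c, d):
    """Relative mismatch between the difference vectors (b - a) and (d - c).

    Returns 0.0 when a, b, c, d form an exact parallelogram.
    """
    ab = b - a
    cd = d - c
    return float(np.linalg.norm(ab - cd) / (np.linalg.norm(ab) + np.linalg.norm(cd)))


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Toy vectors chosen by hand for illustration (assumption: not real features).
man = np.array([1.0, 0.0, 0.0])
woman = np.array([1.0, 1.0, 0.0])
king = np.array([1.0, 0.0, 1.0])
queen = np.array([1.0, 1.0, 1.0])

# Parallelogram: the man->woman and king->queen offsets are identical.
para_err = parallelogram_error(man, woman, king, queen)

# Trapezoid: the offsets are parallel but differ in length.
queen_far = np.array([1.0, 2.0, 1.0])
trap_cos = cosine(woman - man, queen_far - king)
trap_err = parallelogram_error(man, woman, king, queen_far)

print(para_err, trap_cos, trap_err)
```

In this toy setup the first quadruple has zero parallelogram error, while the second keeps a cosine similarity of 1 between its offsets but a nonzero error, which is the trapezoid case.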