arXiv 2309.07315

Traveling Words: A Geometric Interpretation of Transformers

By Raul Molina

Published 2023-09-13

Transformers have significantly advanced the field of natural language processing, but comprehending their internal mechanisms remains a challenge. In this paper, we introduce a novel geometric perspective that elucidates the inner mechanisms of transformer operations. Our primary contribution is illustrating how layer normalization confines the latent features to a hyper-sphere, subsequently enabling attention to m…
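The hyper-sphere claim can be checked numerically. The following is a minimal sketch, not the paper's own formulation: it assumes the standard layer normalization with no learned scale or bias and with the stabilizing epsilon set to zero, and the names `layer_norm`, `norm`, and the sample vector `x` are illustrative. Under those assumptions, every normalized vector of dimension d has Euclidean norm sqrt(d) and components summing to zero, so it lies on a sphere inside the hyperplane orthogonal to the all-ones vector.

```python
import math


def layer_norm(x, eps=0.0):
    """Layer normalization without learned scale or bias (assumption of this sketch)."""
    d = len(x)
    mean = sum(x) / d
    # Population variance (divide by d), as in the usual layer-norm definition.
    var = sum((v - mean) ** 2 for v in x) / d
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def norm(x):
    """Euclidean length of a vector."""
    return math.sqrt(sum(v * v for v in x))


# Arbitrary illustrative feature vector with d = 8.
x = [0.5, -1.2, 3.0, 0.7, -2.1, 1.4, 0.0, -0.3]
y = layer_norm(x)

radius = norm(y)           # equals sqrt(d) when eps = 0
component_sum = sum(y)     # zero up to rounding: y is orthogonal to the all-ones vector

# Rescaling the input does not move the output: only the direction of the
# mean-centered vector survives normalization.
y_scaled = layer_norm([10.0 * v for v in x])
```

A nonzero epsilon makes the radius slightly smaller than sqrt(d), and a learned per-dimension scale would stretch the sphere along the coordinate axes; both are left out here to keep the geometry easy to see.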
