arXiv 2501.00663
Titans: Learning to Memorize at Test Time
By Ali Behrouz, Peilin Zhong, et al.
Published 2024-12-31
Mindmap
Browse the paper's core ideas, clusters, and relationships in a structured outline.
Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called hidden state), attention allows attending to the entire context window, capturing the direct dependencies of all tokens. This more accurate modeling of dependencies, however, comes with a quadratic cost, limi…