arXiv 2510.26692

Kimi Linear: An Expressive, Efficient Attention Architecture

By Kimi Team, Yu Zhang, et al.

Published 2025-10-30

We introduce Kimi Linear, a hybrid linear attention architecture that, for the first time, outperforms full attention under fair comparisons across various scenarios -- including short-context, long-context, and reinforcement learning (RL) scaling regimes. At its core lies Kimi Delta Attention (KDA), an expressive linear attention module that extends Gated DeltaNet with a finer-grained gating mechanism, enabling mor…
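
To make the gating distinction concrete, here is a minimal sketch of one recurrent step of a gated delta rule, written so the decay can be either a single scalar for the whole state or one value per key channel. It is an illustration under stated assumptions, not the paper's implementation: the function name `gated_delta_step`, the tensor shapes, and the exact update order are hypothetical choices, based on the commonly used delta-rule form in which the state is decayed, the value stored under the current key is partially erased, and a new value is written.

```python
import numpy as np

def gated_delta_step(S, q, k, v, beta, alpha):
    """One recurrent step of a gated delta rule (illustrative sketch).

    S:     (d_k, d_v) associative-memory state
    q, k:  (d_k,) query and key; v: (d_v,) value
    beta:  scalar write strength in [0, 1]
    alpha: decay in [0, 1]; a scalar decays the whole state uniformly,
           a (d_k,) array decays each key channel separately (assumed
           here to correspond to "finer-grained" gating)
    """
    # Broadcast a scalar decay to one entry per key channel.
    alpha = np.broadcast_to(np.asarray(alpha, dtype=S.dtype), (S.shape[0],))
    S = alpha[:, None] * S               # decay: scale each key-channel row
    S = S - beta * np.outer(k, k @ S)    # delta rule: erase what is stored under k
    S = S + beta * np.outer(k, v)        # write the new value under k
    return S, S.T @ q                    # updated state and read-out for q

# Usage: write a value under a unit-norm key, then overwrite it.
k = np.array([1.0, 0.0])
S = np.zeros((2, 3))
S, _ = gated_delta_step(S, k, k, np.array([1.0, 2.0, 3.0]), beta=1.0, alpha=1.0)
S, out = gated_delta_step(S, k, k, np.array([4.0, 5.0, 6.0]), beta=1.0, alpha=1.0)
```

With a unit-norm key and `beta=1.0`, the second write replaces the first value rather than adding to it, which is the behaviour the delta rule is meant to provide. Passing an array for `alpha` lets some channels forget quickly while others retain their contents.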