arXiv 2507.20534
Kimi K2: Open Agentic Intelligence
By Kimi Team, Yifan Bai, et al.
Published 2025-07-28
We introduce Kimi K2, a Mixture-of-Experts (MoE) large language model with 32 billion activated parameters and 1 trillion total parameters. We propose the MuonClip optimizer, which improves upon Muon with a novel QK-clip technique that addresses training instability while retaining Muon's token efficiency. Based on MuonClip, K2 was pre-trained on 15.5 trillion tokens without a single loss spike. During post-traini…