arXiv 2507.20534

Kimi K2: Open Agentic Intelligence

By Kimi Team, Yifan Bai, et al.

Published 2025-07-28

We introduce Kimi K2, a Mixture-of-Experts (MoE) large language model with 32 billion activated parameters and 1 trillion total parameters. We propose the MuonClip optimizer, which extends Muon with a novel QK-clip technique that addresses training instability while retaining Muon's token efficiency. Using MuonClip, K2 was pre-trained on 15.5 trillion tokens without a single loss spike. During post-traini…
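
The abstract names QK-clip but does not describe how it works. As a rough illustration only, the sketch below assumes that clipping rescales a head's query and key projection weights whenever the largest attention logit exceeds a threshold. The function names, the threshold `tau=100.0`, the even square-root split between the two matrices, and the toy single-head setup are all assumptions made for this example, not details taken from the summary above.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def max_attention_logit(x, w_q, w_k):
    """Largest scaled dot-product logit q_i . k_j / sqrt(d) for one head."""
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    d = len(q[0])
    return max(
        sum(qi * ki for qi, ki in zip(q_row, k_row)) / math.sqrt(d)
        for q_row in q
        for k_row in k
    )


def qk_clip(w_q, w_k, max_logit, tau=100.0):
    """Hypothetical clipping step: shrink W_q and W_k if max_logit exceeds tau.

    Scaling each matrix by sqrt(gamma) scales every logit by gamma,
    so the largest logit lands exactly on tau.
    """
    if max_logit <= tau:
        return w_q, w_k, 1.0  # nothing to clip
    gamma = tau / max_logit
    scale = math.sqrt(gamma)
    clipped_q = [[v * scale for v in row] for row in w_q]
    clipped_k = [[v * scale for v in row] for row in w_k]
    return clipped_q, clipped_k, gamma


# Toy example: deliberately large weights produce an oversized logit.
x = [[1.0, 2.0], [3.0, 4.0]]
w_q = [[10.0, 0.0], [0.0, 10.0]]
w_k = [[10.0, 0.0], [0.0, 10.0]]

before = max_attention_logit(x, w_q, w_k)
w_q_new, w_k_new, gamma = qk_clip(w_q, w_k, before, tau=100.0)
after = max_attention_logit(x, w_q_new, w_k_new)
```

In this toy setup the rescaled weights bring the largest logit down to the threshold, while weights whose logits are already below it are returned unchanged. A real training loop would apply such a step per head after each optimizer update; the paper should be consulted for the actual procedure.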
