arXiv 2510.07364
Base Models Know How to Reason, Thinking Models Learn When
By Constantin Venhoff, Iván Arcuschin, et al.
Published 2025-10-08
Why do thinking language models like DeepSeek R1 outperform their base counterparts? Despite consistent performance gains, it remains unclear to what extent thinking models learn entirely new reasoning capabilities or repurpose pre-existing base model ones. In this work, we propose a hybrid model where we activate reasoning mechanisms in base models at the right time to elicit thinking-model-level reasoning chains,…