arXiv 2604.01193

Embarrassingly Simple Self-Distillation Improves Code Generation

By Ruixiang Zhang, Richard He Bai, et al.

Published 2026-04-01

Can a large language model (LLM) improve at code generation using only its own raw outputs, without a verifier, a teacher model, or reinforcement learning? We answer in the affirmative with simple self-distillation (SSD): sample solutions from the model with certain temperature and truncation configurations, then fine-tune on those samples with standard supervised fine-tuning. SSD improves Qwen3-30B-Instruct from 42…
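
The sampling-then-collection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the excerpt does not state which truncation scheme or temperature values are used, so top-k and top-p (nucleus) truncation are assumed here, and the names `truncated_distribution`, `sample_token`, `build_sft_pairs`, and the `generate` callable are hypothetical.

```python
import math
import random


def truncated_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return {token_index: prob} after temperature scaling and truncation.

    Assumption: "truncation" means top-k and/or top-p filtering; the
    excerpt does not specify the scheme the paper actually uses.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank tokens from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        order = kept

    # Renormalize over the surviving tokens.
    mass = sum(probs[i] for i in order)
    return {i: probs[i] / mass for i in order}


def sample_token(logits, rng, **kwargs):
    """Draw one token index from the truncated distribution."""
    dist = truncated_distribution(logits, **kwargs)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def build_sft_pairs(prompts, generate, samples_per_prompt=1):
    """Collect raw (prompt, completion) pairs with no verifier or filtering.

    `generate` is a hypothetical callable standing in for the model's
    sampler; the resulting pairs would feed standard supervised fine-tuning.
    """
    return [(p, generate(p)) for p in prompts for _ in range(samples_per_prompt)]
```

In this sketch, the pairs returned by `build_sft_pairs` are used as-is for fine-tuning, which mirrors the abstract's point that no verifier, teacher model, or reinforcement learning is involved.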
