arXiv 2605.10878
Neural Weight Norm = Kolmogorov Complexity
By Tiberiu Musat
Published 2026-05-11
Why does weight decay work? We prove that, in any fixed-precision regime, the smallest weight norm of a looped neural network outputting a binary string equals the Kolmogorov complexity of that string, up to a logarithmic factor. This implies that weight decay induces a prior matching Solomonoff's universal prior, the optimal prior over computable functions, up to a polynomial factor. The result is norm-agnostic: in…