arXiv 2605.10878

Neural Weight Norm = Kolmogorov Complexity

By Tiberiu Musat

Published 2026-05-11

Why does weight decay work? We prove that, in any fixed-precision regime, the smallest weight norm of a looped neural network outputting a binary string equals the Kolmogorov complexity of that string, up to a logarithmic factor. This implies that weight decay induces a prior matching Solomonoff's universal prior, the optimal prior over computable functions, up to a polynomial factor. The result is norm-agnostic: in…
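
The step from a "logarithmic factor" on the norm to a "polynomial factor" on the prior can be made explicit under one possible reading of the abstract. The sketch below is an illustration, not the paper's statement: the symbol N(x) for the smallest weight norm, the additive form of the logarithmic term, and the choice of the discrete universal prior m(x) with prefix complexity K(x) are all assumptions introduced here. Only the definition of m(x) and Levin's coding theorem are standard facts.

```latex
% Assumed notation (not from the paper):
%   K(x): prefix Kolmogorov complexity of the binary string x
%   N(x): smallest weight norm of a looped network that outputs x
%   U:    a fixed universal prefix machine

% Standard definition of the discrete universal prior and the coding theorem:
\[
  m(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-|p|},
  \qquad
  -\log_2 m(x) \;=\; K(x) + O(1).
\]

% Assumed reading of "equals, up to a logarithmic factor":
\[
  N(x) \;=\; K(x) + O\!\bigl(\log K(x)\bigr).
\]

% Exponentiating turns the logarithmic term into a polynomial factor,
% where the O(1) exponent may be positive or negative:
\[
  2^{-N(x)} \;=\; 2^{-K(x)} \cdot 2^{\pm O(\log K(x))}
            \;=\; 2^{-K(x)} \cdot K(x)^{\pm O(1)}.
\]

% Combined with the coding theorem, a prior that weights x by 2^{-N(x)}
% would agree with m(x) up to a factor polynomial in K(x).
```

Under these assumptions, a penalty on weight norm acts like a penalty on description length, which is why the induced prior tracks the universal prior so closely.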
