arXiv 1712.09913
Visualizing the Loss Landscape of Neural Nets
By Hao Li, Zheng Xu, et al.
Published 2017-12-28
Neural network training relies on our ability to find "good" minimizers of highly non-convex loss functions. It is well known that certain network architecture designs (e.g., skip connections) produce loss functions that are easier to train, and well-chosen training parameters (batch size, learning rate, optimizer) produce minimizers that generalize better. However, the reasons for these differences, and their effects on t…