arXiv 1712.09913

Visualizing the Loss Landscape of Neural Nets

By Hao Li, Zheng Xu, et al.

Published 2017-12-28

Neural network training relies on our ability to find "good" minimizers of highly non-convex loss functions. It is well known that certain network architecture designs (e.g., skip connections) produce loss functions that are easier to train, and that well-chosen training parameters (batch size, learning rate, optimizer) produce minimizers that generalize better. However, the reasons for these differences, and their effects on t…
