arXiv 1711.02257
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
By Zhao Chen, Vijay Badrinarayanan, et al.
Published 2017-11-07
Deep multitask networks, in which one neural network produces multiple predictive outputs, can offer better speed and performance than their single-task counterparts but are challenging to train properly. We present a gradient normalization (GradNorm) algorithm that automatically balances training in deep multitask models by dynamically tuning gradient magnitudes. We show that for various network architectures, for…
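The truncated abstract above does not spell out the update rule, so the following is only a rough sketch of what "dynamically tuning gradient magnitudes" could look like for per-task loss weights. It assumes the gradient norm of each unweighted task loss on the shared layer has already been computed and is passed in as a plain number. The function name `gradnorm_step`, its parameters, and the values of `alpha` and `lr` are illustrative assumptions, not taken from this page.

```python
def gradnorm_step(weights, grad_norms, losses, initial_losses,
                  alpha=1.5, lr=0.025):
    """One illustrative loss-weight update (hypothetical helper).

    weights        : current positive loss weights w_i, one per task
    grad_norms     : norm of each unweighted task loss gradient on the shared layer
    losses         : current task losses L_i(t)
    initial_losses : task losses at the start of training L_i(0)
    alpha, lr      : assumed hyperparameters, chosen only for this sketch
    """
    n = len(weights)

    # Weighted gradient norms G_i = w_i * g_i and their average.
    G = [w * g for w, g in zip(weights, grad_norms)]
    G_mean = sum(G) / n

    # Loss ratios measure how far each task has trained; dividing by the
    # mean gives a relative rate (larger means the task is lagging).
    ratios = [l / l0 for l, l0 in zip(losses, initial_losses)]
    ratio_mean = sum(ratios) / n
    rel_rates = [x / ratio_mean for x in ratios]

    # Target gradient norm per task, treated as a constant in the update.
    targets = [G_mean * r ** alpha for r in rel_rates]

    # Gradient of sum_i |G_i - target_i| with respect to w_i is
    # sign(G_i - target_i) * g_i when the target is held fixed.
    def sign(x):
        return (x > 0) - (x < 0)

    new_w = [w - lr * sign(Gi - t) * g
             for w, Gi, t, g in zip(weights, G, targets, grad_norms)]

    # Renormalize so the weights sum to the number of tasks.
    total = sum(new_w)
    return [n * w / total for w in new_w]


if __name__ == "__main__":
    # Two tasks at the same training stage, but task 0 has a much larger gradient.
    print(gradnorm_step([1.0, 1.0], [10.0, 1.0], [1.0, 1.0], [1.0, 1.0]))
```

In this toy call the task with the larger gradient norm has its weight reduced and the other task's weight increased, which is the balancing behavior the abstract describes. In a real network the gradient norms would come from an autodiff framework rather than being supplied by hand.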