arXiv 2012.09816
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
By Zeyuan Allen-Zhu and Yuanzhi Li
Published 2020-12-17
We formally study how an ensemble of deep learning models can improve test accuracy, and how the superior performance of the ensemble can be distilled into a single model using knowledge distillation. We consider the challenging case where the ensemble is simply an average of the outputs of a few independently trained neural networks with the SAME architecture, trained using the SAME algorithm on the SAME data set, and the…
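The two operations named in the abstract, averaging the outputs of independently trained networks and distilling that average into a single model, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the paper's own setup: the function names, the example logits, the choice to average logits (not probabilities), and the temperature-softened cross-entropy used as the distillation objective.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one vector of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_average(member_outputs):
    """Average the per-class outputs of several models for one input."""
    n_models = len(member_outputs)
    return [sum(per_class) / n_models for per_class in zip(*member_outputs)]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    Assumed objective for illustration; extreme logits could underflow
    the student probabilities, which this sketch does not guard against.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Hypothetical logits from three independently trained models on one input.
member_logits = [
    [2.0, 0.5, -1.0],
    [1.5, 1.0, -0.5],
    [2.5, 0.0, -1.5],
]

# The ensemble output acts as the teacher signal for a single student model.
teacher_logits = ensemble_average(member_logits)
untrained_student = [0.0, 0.0, 0.0]

loss_before = distillation_loss(untrained_student, teacher_logits)
loss_matched = distillation_loss(teacher_logits, teacher_logits)
```

In this sketch the loss is smallest when the student reproduces the teacher's softened distribution, so `loss_matched` is no larger than `loss_before`. A training loop would adjust the student's parameters to reduce this loss over the data set.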