Talk in Workshop: OPT2020: Optimization for Machine Learning
The Convexity of Learning Infinite-width Deep Neural Networks
Invited speaker: Tong Zhang
Abstract:
Deep learning has achieved considerable empirical success in recent years. Although deep neural networks (DNNs) are highly nonconvex with respect to their model parameters, it has been observed that training overparametrized DNNs yields consistent solutions that are highly reproducible across different random initializations.
I will explain this phenomenon by modeling DNNs using feature representations, and show that the optimization landscape is convex with respect to the features. Moreover, we show that optimization with respect to the nonconvex DNN parameters leads to a globally optimal solution under an idealized regularity condition, which can explain various empirical findings.
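To illustrate the kind of convexity involved (a minimal sketch, which may differ from the exact formulation presented in the talk): an infinite-width two-layer network can be written as f(x) = \int \phi(x; \theta) \, d\rho(\theta), where \rho is a distribution over neuron parameters \theta. The map \rho \mapsto f is linear, so an objective of the form \rho \mapsto \sum_i \ell(f(x_i), y_i) + R(\rho), with \ell convex in its first argument and R a convex regularizer, is convex in the feature distribution \rho even though it is nonconvex in the individual network weights.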