Poster in Workshop: Mathematics of Modern Machine Learning (M3L)

An empirical study of the $(L_0, L_1)$-smoothness condition

Y Cooper

Keywords: [ Optimization ] [ Smoothness ] [ Deep Learning ]


Abstract: The $(L_0, L_1)$-smoothness condition was introduced by Zhang, He, Sra, and Jadbabaie in 2020, who both proved convergence bounds under this assumption and provided empirical evidence that it is satisfied in deep learning. Since then, many groups have proven convergence guarantees for functions that satisfy this condition, motivated by the expectation that loss functions arising in deep learning satisfy it. In this paper we provide a further empirical study of this condition in the setting of feedforward neural networks of depth at least 2, with $L_2$ or cross-entropy loss. The results suggest that the $(L_0, L_1)$-smoothness condition is not satisfied in this setting.
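For reference, the condition of Zhang, He, Sra, and Jadbabaie (2020) states that a twice-differentiable function $f$ is $(L_0, L_1)$-smooth if

$$\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\| \quad \text{for all } x,$$

that is, the local smoothness may grow linearly with the gradient norm rather than being bounded by a single constant $L$.

The poster's experimental protocol is not reproduced on this page, but the following is a minimal sketch of one way such a condition might be probed empirically: along a training trajectory, estimate the local gradient-Lipschitz constant by finite differences and compare it against the gradient norm. The toy depth-2 network, synthetic data, learning rate, and perturbation scale below are illustrative assumptions, not the authors' setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy depth-2 feedforward network with L2 (MSE) loss on synthetic data.
X = torch.randn(256, 10)
y = torch.randn(256, 1)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()

def flat_grad():
    """Compute the loss gradient at the current parameters as one flat vector."""
    model.zero_grad()
    loss_fn(model(X), y).backward()
    return torch.cat([p.grad.detach().flatten() for p in model.parameters()])

def local_smoothness(eps=1e-3):
    """Finite-difference estimate of the local gradient-Lipschitz constant:
    ||grad(w + eps*d) - grad(w)|| / eps for a random unit direction d."""
    g0 = flat_grad()
    d = [torch.randn_like(p) for p in model.parameters()]
    norm = torch.sqrt(sum((di**2).sum() for di in d))
    with torch.no_grad():  # perturb parameters by eps in direction d
        for p, di in zip(model.parameters(), d):
            p.add_(eps * di / norm)
    g1 = flat_grad()
    with torch.no_grad():  # restore the original parameters
        for p, di in zip(model.parameters(), d):
            p.sub_(eps * di / norm)
    return ((g1 - g0).norm() / eps).item(), g0.norm().item()

opt = torch.optim.SGD(model.parameters(), lr=0.05)
for step in range(200):
    L_hat, g_norm = local_smoothness()
    if step % 20 == 0:
        # Under (L0, L1)-smoothness, L_hat should stay below L0 + L1 * g_norm;
        # plotting L_hat against g_norm probes whether an affine bound fits.
        print(f"step {step:3d}  ||grad|| = {g_norm:.4f}  local L = {L_hat:.4f}")
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()
```

If the resulting scatter of local smoothness against gradient norm is not bounded by any affine function $L_0 + L_1 \|\nabla f\|$, that is evidence against the condition in the setting tested, which is the kind of conclusion the abstract reports.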
