Poster in Workshop: Mathematics of Modern Machine Learning (M3L)

An empirical study of the $(L_0, L_1)$-smoothness condition

Y Cooper

Keywords: [ Optimization ] [ Smoothness ] [ Deep Learning ]


Abstract: The $(L_0, L_1)$-smoothness condition was introduced by Zhang, He, Sra, and Jadbabaie in 2020, who both proved convergence bounds under this assumption and provided empirical evidence that it is satisfied in deep learning. Since then, many groups have proven convergence guarantees for functions that satisfy this condition, motivated by the expectation that loss functions arising in deep learning satisfy it. In this paper we provide a further empirical study of this condition in the setting of feedforward neural networks of depth at least 2, with $L_2$ or cross-entropy loss. The results suggest that the $(L_0, L_1)$-smoothness condition is not satisfied in this setting.
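For reference, the condition of Zhang, He, Sra, and Jadbabaie (2020) states that a twice-differentiable function $f$ is $(L_0, L_1)$-smooth if

$$\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\| \quad \text{for all } x,$$

that is, the local smoothness may grow linearly with the gradient norm rather than being bounded by a single constant $L$.

The poster's experimental protocol is not reproduced on this page, but the following is a minimal sketch of one way such a condition might be probed empirically: along a training trajectory, estimate the local gradient-Lipschitz constant by finite differences and compare it against the gradient norm. The toy depth-2 network, synthetic data, learning rate, and perturbation scale below are illustrative assumptions, not the authors' setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy depth-2 feedforward network with L2 (MSE) loss on synthetic data.
X = torch.randn(256, 10)
y = torch.randn(256, 1)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()

def flat_grad():
    """Compute the loss gradient at the current parameters as one flat vector."""
    model.zero_grad()
    loss_fn(model(X), y).backward()
    return torch.cat([p.grad.detach().flatten() for p in model.parameters()])

def local_smoothness(eps=1e-3):
    """Finite-difference estimate of the local gradient-Lipschitz constant:
    ||grad(w + eps*d) - grad(w)|| / eps for a random unit direction d."""
    g0 = flat_grad()
    d = [torch.randn_like(p) for p in model.parameters()]
    norm = torch.sqrt(sum((di**2).sum() for di in d))
    with torch.no_grad():  # perturb parameters by eps in direction d
        for p, di in zip(model.parameters(), d):
            p.add_(eps * di / norm)
    g1 = flat_grad()
    with torch.no_grad():  # restore the original parameters
        for p, di in zip(model.parameters(), d):
            p.sub_(eps * di / norm)
    return ((g1 - g0).norm() / eps).item(), g0.norm().item()

opt = torch.optim.SGD(model.parameters(), lr=0.05)
for step in range(200):
    L_hat, g_norm = local_smoothness()
    if step % 20 == 0:
        # Under (L0, L1)-smoothness, L_hat should stay below L0 + L1 * g_norm;
        # plotting L_hat against g_norm probes whether an affine bound fits.
        print(f"step {step:3d}  ||grad|| = {g_norm:.4f}  local L = {L_hat:.4f}")
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()
```

If the resulting scatter of local smoothness against gradient norm is not bounded by any affine function $L_0 + L_1 \|\nabla f\|$, that is evidence against the condition in the setting tested, which is the kind of conclusion the abstract reports.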
