Plenary speaker in Workshop: OPT 2023: Optimization for Machine Learning
DoG is SGD's best friend: toward tuning-free stochastic optimization
Yair Carmon
Abstract:
While stochastic optimization methods drive continual improvements in machine learning, choosing the optimization parameters—and particularly the learning rate (LR)—remains difficult. In this talk, I will describe our work on removing LR tuning from stochastic gradient descent (SGD), culminating in a tuning-free dynamic SGD step size formula, which we call Distance over Gradients (DoG). We show that DoG removes the need to tune the learning rate both theoretically (obtaining strong parameter-free convergence guarantees) and empirically (performing nearly as well as expensively tuned SGD on neural network training tasks).
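The abstract names the method but does not state the step size formula. As a rough illustration of the "distance over gradients" idea only, the sketch below sets the step size to the maximum distance traveled from the initial point divided by the square root of the running sum of squared gradient norms, with a small initial-movement constant r_eps so the first step is nonzero. The function names, the r_eps constant, and the quadratic usage example are illustrative assumptions, not the speaker's implementation.

```python
import numpy as np

def dog_style_sgd(grad_fn, x0, steps=1000, r_eps=1e-6):
    """Sketch of a distance-over-gradients style step size:
    eta_t = (max distance from the initial point so far)
            / sqrt(sum of squared stochastic gradient norms)."""
    x = x0.copy()
    max_dist = r_eps      # assumed small initial "distance" so eta_0 > 0
    grad_sq_sum = 0.0
    for _ in range(steps):
        g = grad_fn(x)                               # stochastic gradient at x
        grad_sq_sum += float(np.dot(g, g))           # accumulate ||g_t||^2
        max_dist = max(max_dist, float(np.linalg.norm(x - x0)))
        eta = max_dist / np.sqrt(grad_sq_sum)        # "distance over gradients"
        x = x - eta * g                              # plain SGD update with eta_t
    return x

# Hypothetical usage: minimize a simple quadratic with noisy gradients.
rng = np.random.default_rng(0)
grad = lambda x: 2 * x + 0.1 * rng.standard_normal(x.shape)
x_final = dog_style_sgd(grad, x0=np.ones(10), steps=2000)
```

Note that no learning rate is passed in: the only constant is r_eps, which sets the scale of the very first step rather than the overall step size.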