Poster in Workshop: OPT 2023: Optimization for Machine Learning
MSL: An Adaptive Momentum-based Stochastic Line-search Framework
Chen Fan · Sharan Vaswani · Christos Thrampoulidis · Mark Schmidt
Various adaptive step sizes have been proposed recently to reduce the amount of tedious manual tuning. A popular example is backtracking line-search based on a stochastic Armijo condition. But the success of this strategy relies crucially on the search direction being a descent direction. Importantly, this condition is violated by both SGD with momentum (SGDM) and Adam, which are common choices in deep-net training. Adaptively choosing the step size in this setting is thus non-trivial and less explored despite its practical relevance. In this work, we propose two frameworks, namely, momentum correction and restart, that allow the use of stochastic line-search in conjunction with a generalized Armijo condition, and apply them to both SGDM and Adam. We empirically verify that the proposed algorithms are robust to the choice of the momentum parameter and other hyperparameters.
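To make the descent-direction issue concrete, here is a minimal sketch (not the authors' MSL implementation) of standard backtracking under a stochastic Armijo condition, f(w + eta*d) <= f(w) + c*eta*<grad f(w), d>, followed by a toy heavy-ball run in which the momentum direction d = -v stops being a descent direction. The function name `stochastic_armijo_step` and the hyperparameter values (c, beta, eta_max, mu) are illustrative assumptions, not taken from the poster.

```python
import numpy as np

def stochastic_armijo_step(loss_fn, params, grad, direction,
                           eta_max=1.0, c=0.1, beta=0.7, max_backtracks=50):
    """Shrink eta until the (stochastic) Armijo condition holds on the
    current mini-batch:
        loss(w + eta * d) <= loss(w) + c * eta * <grad, d>.
    Termination at a useful step size is only guaranteed when d is a
    descent direction, i.e. <grad, d> < 0."""
    f0 = loss_fn(params)
    slope = grad @ direction  # negative iff d is a descent direction
    eta = eta_max
    for _ in range(max_backtracks):
        if loss_fn(params + eta * direction) <= f0 + c * eta * slope:
            break
        eta *= beta  # backtrack: shrink the step and re-check
    return params + eta * direction, eta

# Toy ill-conditioned quadratic with heavy-ball momentum: the direction
# d = -v quickly becomes an ascent direction, which is the failure mode
# the abstract describes for SGDM/Adam (A, mu, and the loss are
# illustrative choices).
A = np.diag([1.0, 10.0])
loss_fn = lambda w: 0.5 * w @ A @ w
w, v, mu = np.array([1.0, 1.0]), np.zeros(2), 0.9
for t in range(5):
    g = A @ w
    v = mu * v + g          # momentum buffer
    d = -v                  # SGDM-style search direction
    print(f"t={t}: <grad, d> = {g @ d:+.2f}",
          "(descent)" if g @ d < 0 else "(NOT a descent direction)")
    w, eta = stochastic_armijo_step(loss_fn, w, g, d)
```

When <grad, d> > 0, the Armijo inequality cannot be satisfied for any positive step on this quadratic, so the backtracking loop exhausts its budget and returns a vanishingly small step. Going by the abstract, the proposed momentum-correction and restart frameworks are what keep a generalized Armijo condition usable in exactly this regime; the sketch above only reproduces the baseline failure.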