Poster in Workshop: OPT 2023: Optimization for Machine Learning
Variance Reduced Model Based Methods: New rates and adaptive step sizes
Robert Gower · Frederik Kunstner · Mark Schmidt
Abstract:
Variance reduced gradient methods were introduced to control the variance of SGD (Stochastic Gradient Descent). Model-based methods are able to make use of a known lower bound on the loss; for instance, most loss functions are bounded below by zero. We show how these two classes of methods can be seamlessly combined. As an example, we present the Model-based Stochastic Average Gradient method (MSAG), which results from using a truncated model together with the SAG method. At each iteration, MSAG computes an adaptive learning rate based on a known lower bound. When given access to the optimal objective as the lower bound, MSAG has several favorable convergence properties, including monotonic iterates and convergence in the non-smooth, smooth, and strongly convex settings. This shows that we can essentially trade off knowing the smoothness constant $L_{\max}$ for knowing the optimal objective in order to achieve the favorable convergence of variance reduced gradient methods.
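The full MSAG algorithm is specified in the paper; as a rough illustration only, the following Python sketch shows how a truncated-model (Polyak-type) step size computed from a known lower bound might be combined with SAG-style gradient averaging. The function name msag_sketch, its parameters, and the exact placement of the adaptive step are assumptions for illustration, not the authors' specification.

import numpy as np

def msag_sketch(grad_i, loss_i, x0, n, lower_bound=0.0, max_step=1.0,
                n_iters=1000, seed=0):
    """Illustrative sketch (not the paper's exact MSAG): SAG-style gradient
    table combined with a Polyak-type adaptive step size derived from a
    known lower bound on the loss."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    grad_table = np.zeros((n, x.size))   # stored per-example gradients (SAG memory)
    avg_grad = grad_table.mean(axis=0)   # running average of the stored gradients
    for _ in range(n_iters):
        i = rng.integers(n)              # sample one example uniformly
        g_i = grad_i(i, x)
        # SAG update: swap the stored gradient for example i into the average
        avg_grad += (g_i - grad_table[i]) / n
        grad_table[i] = g_i
        # Adaptive step from the truncated model: gap between the current
        # loss estimate and the known lower bound, scaled by the gradient norm
        gap = max(loss_i(i, x) - lower_bound, 0.0)
        denom = np.dot(avg_grad, avg_grad)
        step = min(max_step, gap / denom) if denom > 0 else 0.0
        x -= step * avg_grad
    return x

# Example usage (hypothetical): least squares on a consistent system,
# where the optimal objective 0 serves as the known lower bound.
A = np.random.randn(50, 10)
b = A @ np.random.randn(10)
loss_i = lambda i, x: 0.5 * (A[i] @ x - b[i]) ** 2
grad_i = lambda i, x: (A[i] @ x - b[i]) * A[i]
x_hat = msag_sketch(grad_i, loss_i, np.zeros(10), n=50, lower_bound=0.0)

Note that the adaptive step here requires no knowledge of the smoothness constant $L_{\max}$; the only problem-specific quantity it uses is the lower bound on the loss, which mirrors the trade-off described in the abstract.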