NeurIPS PSPS: Preconditioned Stochastic Polyak Step-size method for badly scaled data

Spotlight Talk in Workshop: Order up! The Benefits of Higher-Order Optimization in Machine Learning

Farshed Abdukhakimov

Abstract:

The family of Stochastic Gradient Methods with Polyak Step-size offers an update rule that alleviates the need to fine-tune the learning rate of an optimizer. Recent work (Robert M. Gower, Mathieu Blondel, Nidham Gazagnadou, and Fabian Pedregosa: Cutting some slack for SGD with adaptive Polyak stepsizes) introduces a slack variable, which makes these methods applicable outside of the interpolation regime. In this paper, we combine preconditioning and slack in an updated optimization algorithm and demonstrate its performance on badly scaled and/or ill-conditioned datasets. We use Hutchinson's method to obtain an estimate of the Hessian, which is used as the preconditioner.
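
To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the algorithm from the talk. It assumes a diagonal preconditioner built from Hutchinson's estimator, diag(H) ≈ E[z ⊙ (Hz)] with Rademacher vectors z, and a preconditioned Polyak step of the form w ← w − γ D⁻¹g with γ = (f(w) − f*) / (gᵀD⁻¹g). The slack variable is omitted, the optimal loss f* is assumed to be 0, and the function names `hutchinson_diag` and `preconditioned_polyak_step` are hypothetical.

```python
import numpy as np


def hutchinson_diag(hvp, dim, num_samples, rng):
    """Estimate diag(H) by averaging z * (H z) over Rademacher vectors z."""
    estimate = np.zeros(dim)
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=dim)  # entries are +1 or -1
        estimate += z * hvp(z)                 # hvp(z) must return H @ z
    return estimate / num_samples


def preconditioned_polyak_step(w, loss, grad, d, loss_star=0.0):
    """Return w - gamma * D^{-1} grad, where D = diag(d) and
    gamma = (loss - loss_star) / (grad^T D^{-1} grad).

    Assumed form of the update; no slack variable is used here.
    """
    scaled_grad = grad / d
    denom = grad @ scaled_grad
    if denom == 0.0:  # zero gradient: nothing to update
        return w
    gamma = (loss - loss_star) / denom
    return w - gamma * scaled_grad


rng = np.random.default_rng(0)

# Toy Hessian with badly scaled curvature. It is diagonal on purpose:
# for a diagonal H every Hutchinson sample equals diag(H) exactly.
H = np.diag([1.0, 1e2, 1e4])
d = hutchinson_diag(lambda z: H @ z, dim=3, num_samples=10, rng=rng)

# One sample of a least-squares problem: f(w) = 0.5 * (x @ w - y) ** 2,
# whose optimal value is 0, so loss_star keeps its default.
x = np.array([1.0, 10.0, 100.0])
y = 3.0
w = np.zeros(3)
residual = x @ w - y
w_new = preconditioned_polyak_step(w, 0.5 * residual**2, residual * x, d)
new_residual = x @ w_new - y
```

For this squared loss the step halves the residual on the sampled point whatever the scaling of `x`, and dividing by `d` spreads the correction evenly across the differently scaled coordinates. For a non-diagonal Hessian the Hutchinson estimate is noisy and its entries can come out non-positive, so a practical version would need a safeguard such as clamping `d`; that is left out of this sketch.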
