NeurIPS Poster Regularized Gradient Boosting

Poster

Regularized Gradient Boosting

Corinna Cortes · Mehryar Mohri · Dmitry Storcheus

East Exhibition Hall B, C #11

Keywords: [ Algorithms ] [ Boosting and Ensemble Methods ] [ Algorithms -> AutoML; Algorithms -> Large Scale Learning; Algorithms -> Meta-Learning; Theory -> Learning Theory; Theory ] [ Reg ]

[ Abstract ]

Abstract:

Gradient Boosting (\GB) is a popular and very successful ensemble method for binary trees. While various types of regularization of the base predictors are used with this algorithm, the theory that connects such regularizations with generalization guarantees is poorly understood. We fill this gap by deriving data-dependent learning guarantees for \GB\ used with \emph{regularization}, expressed in terms of the Rademacher complexities of the constrained families of base predictors. We introduce a new algorithm, called \rgb\, that directly benefits from these generalization bounds and that, at every boosting round, applies the \emph{Structural Risk Minimization} principle to search for a base predictor with the best empirical fit versus complexity trade-off. Inspired by \emph{Randomized Coordinate Descent} we provide a scalable implementation of our algorithm, able to search over large families of base predictors. Finally, we provide experimental results, demonstrating that our algorithm achieves significantly better out-of-sample performance on multiple datasets than the standard \GB\ algorithm used with its regularization.

Live content is unavailable. Log in and register to view live content