Poster in Workshop: OPT 2023: Optimization for Machine Learning
Optimization dependent generalization bound for ReLU networks based on sensitivity in the tangent bundle
Dániel Rácz · Mihály Petreczky · Bálint Daróczy
Recent advances in deep learning have given us some very promising results on the generalization ability of deep neural networks; however, the literature still lacks a comprehensive theory explaining why heavily over-parametrized models are able to generalize well while fitting the training data. In this paper we propose a PAC-type bound on the generalization error of feedforward ReLU networks via estimating the Rademacher complexity of the set of networks reachable from an initial parameter vector via gradient descent. The key idea is to bound the sensitivity of the network's gradient to perturbation of the input data along the optimization trajectory. The obtained bound does not explicitly depend on the depth of the network. Our results are experimentally verified on the MNIST and CIFAR-10 datasets.
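The central quantity in the abstract, the sensitivity of the network's parameter gradient to input perturbations along the gradient-descent trajectory, can be probed empirically. The following is a minimal sketch, not the authors' code: the architecture, synthetic data, learning rate, and perturbation scale are all placeholder assumptions, and the finite-difference ratio is just one plausible proxy for the sensitivity the paper bounds.

```python
# Hypothetical illustration: track how much the parameter gradient of a
# small feedforward ReLU network's output changes under a fixed input
# perturbation, measured after each step of a gradient-descent trajectory.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder feedforward ReLU network (not the paper's architecture).
net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

# Synthetic data standing in for MNIST/CIFAR-10 features.
X = torch.randn(128, 10)
y = torch.randn(128, 1)

def param_grad(x):
    """Gradient of the scalar network output w.r.t. all parameters at input x."""
    out = net(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, net.parameters())
    return torch.cat([g.flatten() for g in grads])

x0 = X[0]
delta = 1e-3 * torch.randn_like(x0)  # small fixed input perturbation

for step in range(5):
    # One gradient-descent step on the squared loss.
    opt.zero_grad()
    loss = ((net(X) - y) ** 2).mean()
    loss.backward()
    opt.step()

    # Finite-difference sensitivity proxy: change in the parameter
    # gradient under the input perturbation, normalised by its size.
    sens = (param_grad(x0 + delta) - param_grad(x0)).norm() / delta.norm()
    print(f"step {step}: loss={loss.item():.4f}, gradient sensitivity ~ {sens.item():.4f}")
```

Under the paper's claim, keeping this sensitivity controlled along the trajectory is what yields a Rademacher-complexity estimate, and hence a generalization bound, that does not explicitly depend on the network's depth.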