A shooting formulation of deep learning
François-Xavier Vialard, Roland Kwitt, Susan Wei, Marc Niethammer
Oral presentation: Orals & Spotlights Track 28: Deep Learning
on 2020-12-10T06:00:00-08:00 - 2020-12-10T06:15:00-08:00
on 2020-12-10T06:00:00-08:00 - 2020-12-10T06:15:00-08:00
Poster Session 6 (more posters)
on 2020-12-10T09:00:00-08:00 - 2020-12-10T11:00:00-08:00
GatherTown: Deep Learning ( Town E0 - Spot B2 )
on 2020-12-10T09:00:00-08:00 - 2020-12-10T11:00:00-08:00
GatherTown: Deep Learning ( Town E0 - Spot B2 )
Join GatherTown
Only iff poster is crowded, join Zoom . Authors have to start the Zoom call from their Profile page / Presentation History.
Only iff poster is crowded, join Zoom . Authors have to start the Zoom call from their Profile page / Presentation History.
Toggle Abstract Paper (in Proceedings / .pdf)
Abstract: A residual network may be regarded as a discretization of an ordinary differential equation (ODE) which, in the limit of time discretization, defines a continuous-depth network. Although important steps have been taken to realize the advantages of such continuous formulations, most current techniques assume identical layers. Indeed, existing works throw into relief the myriad difficulties of learning an infinite-dimensional parameter in a continuous-depth neural network. To this end, we introduce a shooting formulation which shifts the perspective from parameterizing a network layer-by-layer to parameterizing over optimal networks described only by a set of initial conditions. For scalability, we propose a novel particle-ensemble parameterization which fully specifies the optimal weight trajectory of the continuous-depth neural network. Our experiments show that our particle-ensemble shooting formulation can achieve competitive performance. Finally, though the current work is inspired by continuous-depth neural networks, the particle-ensemble shooting formulation also applies to discrete-time networks and may lead to a new fertile area of research in deep learning parameterization.