NeurIPS Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation

Poster in Workshop: 5th Workshop on Meta-Learning

Studying BatchNorm Learning Rate Decay on Meta-Learning Inner-Loop Adaptation

Alexander Wang · Sasha (Alexandre) Doubov · Gary Leung

Abstract:

Meta-learning for few-shot classification has been challenged both on its effectiveness relative to simpler pretraining methods and on the validity of its claim of "learning to learn". Recent work suggests that MAML-based models do not perform "rapid learning" in the inner loop but instead reuse features, adapting only the final linear layer. Separately, BatchNorm, a near-ubiquitous component of model architectures, has been shown to impose an implicit learning rate decay on the preceding layers of a network. We study the impact of BatchNorm's implicit learning rate decay on feature reuse in meta-learning methods and find that counteracting it increases the change in intermediate layers during adaptation. We also find that counteracting this learning rate decay sometimes improves performance on few-shot classification tasks.
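The implicit learning rate decay mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' code. It assumes a single linear unit followed by batch normalization with no affine parameters and no epsilon term, and the names `bn_loss`, `numerical_grad`, and `direction_change` are hypothetical helpers introduced only for illustration. Under these assumptions the loss does not change when the weights are rescaled, so the gradient shrinks as the weight norm grows, and a fixed-size SGD step rotates a larger weight vector by a smaller amount.

```python
import numpy as np

def bn_loss(w, X, t):
    """Squared error on batch-normalized pre-activations (toy setup, eps = 0)."""
    z = X @ w
    z_hat = (z - z.mean()) / np.sqrt(z.var())  # normalization removes the scale of w
    return float(np.mean((z_hat - t) ** 2))

def numerical_grad(f, w, h=1e-6):
    """Central-difference gradient; avoids hand-deriving the BatchNorm backward pass."""
    g = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = h
        g[i] = (f(w + step) - f(w - step)) / (2 * h)
    return g

def direction_change(w, g, lr):
    """How far one SGD step moves the unit-norm direction of w."""
    w_new = w - lr * g
    return np.linalg.norm(w_new / np.linalg.norm(w_new) - w / np.linalg.norm(w))

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))   # toy batch of 32 inputs with 5 features
t = rng.normal(size=32)        # arbitrary regression targets
w = rng.normal(size=5)         # weights of the layer preceding the normalization

f = lambda v: bn_loss(v, X, t)
g1 = numerical_grad(f, w)          # gradient at w
g4 = numerical_grad(f, 4.0 * w)    # gradient at the same direction, 4x the norm

lr = 1e-3
ratio = direction_change(w, g1, lr) / direction_change(4.0 * w, g4, lr)

print("loss unchanged by rescaling:", np.isclose(f(w), f(4.0 * w)))
print("gradient shrinks by the scale factor:", np.allclose(g4, g1 / 4.0, atol=1e-6))
print("direction-change ratio for 4x larger weights:", ratio)
```

In this toy setting the gradient at `4 * w` is one quarter of the gradient at `w`, and the same learning rate moves the weight direction about sixteen times less. This is one way to read "implicit learning rate decay": as weight norms grow during training, the effective step size on the layers preceding BatchNorm falls roughly as the inverse square of the norm.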
