Poster in Workshop: 5th Workshop on Meta-Learning
Neural Processes with Stochastic Attention: Paying more attention to the context dataset
Mingyu Kim · KyeongRyeol Go · Se-Young Yun
Neural processes (NPs) aim to stochastically complete unseen data points based on a given context dataset. NPs essentially leverage the given dataset as a context embedding from which to derive an identifier suitable for a novel task. To improve prediction accuracy, many NP variants have investigated context embedding approaches, generally designing novel network architectures and aggregation functions that satisfy permutation invariance. This paper proposes a stochastic attention mechanism for NPs to capture appropriate context information. From the perspective of information theory, we demonstrate that the proposed method encourages the context embedding to be differentiated from the target dataset. This differentiated information induces NPs to learn to derive appropriate identifiers by jointly considering context embeddings and features of the target dataset. We empirically show that our approach substantially outperforms various conventional NPs on 1D regression and the Lotka-Volterra problem, as well as on image completion. Moreover, we observe that the proposed method maintains performance and captures context embeddings under restricted task distributions, where typical NPs suffer from a lack of effective tasks for learning context embeddings. The proposed method achieves results comparable to state-of-the-art methods on the MovieLens-10k dataset, a real-world problem with limited users, and performs well on the image completion task even with a very limited meta-training dataset.
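To make the core idea concrete, below is a minimal PyTorch sketch of cross-attention from target inputs to a context set in which the attention weights are made stochastic. The abstract does not specify the exact variational construction, so this sketch assumes a simple reparameterized Gaussian perturbation of the attention logits; the class name `StochasticAttention`, the `noise_std` parameter, and the shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StochasticAttention(nn.Module):
    """Illustrative cross-attention over a context set with stochastic
    attention weights.

    Assumption: stochasticity is injected as reparameterized Gaussian
    noise on the attention logits during training; the paper's actual
    mechanism (e.g., a full variational distribution over weights)
    may differ.
    """

    def __init__(self, dim: int, noise_std: float = 0.1):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.noise_std = noise_std
        self.scale = dim ** -0.5

    def forward(self, target_x, context_x, context_r):
        # target_x:  (B, T, dim) target inputs, used as queries
        # context_x: (B, C, dim) context inputs, used as keys
        # context_r: (B, C, dim) context representations, used as values
        q = self.q_proj(target_x)
        k = self.k_proj(context_x)
        v = self.v_proj(context_r)
        logits = torch.einsum("btd,bcd->btc", q, k) * self.scale
        if self.training:
            # Reparameterized noise makes the context embedding a random
            # variable rather than a deterministic point estimate.
            logits = logits + self.noise_std * torch.randn_like(logits)
        attn = F.softmax(logits, dim=-1)  # (B, T, C)
        # Target-specific context embedding aggregated from the context set.
        return torch.einsum("btc,bcd->btd", attn, v)


# Example usage with dummy shapes (batch=4, 10 targets, 20 context points):
layer = StochasticAttention(dim=64)
out = layer(torch.randn(4, 10, 64), torch.randn(4, 20, 64), torch.randn(4, 20, 64))
print(out.shape)  # torch.Size([4, 10, 64])
```

Because the noise enters only at training time, repeated forward passes yield different context embeddings for the same task, which is one plausible way to realize the stochastic attention the abstract describes.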