Poster in Workshop: Workshop on Machine Learning Safety
Few-Shot Transferable Robust Representation Learning via Bilevel Attacks
Minseon Kim · Hyeonjeong Ha · Sung Ju Hwang
Existing adversarial learning methods assume the availability of a large amount of data from which we can generate adversarial examples. However, in an adversarial meta-learning setting, the model needs to learn transferable robust representations for unseen domains with only a few adversarial examples, which is a very difficult goal to achieve even with a large amount of data. To tackle this challenge, we propose a novel adversarial self-supervised meta-learning framework with bilevel attacks, which aims to learn robust representations that generalize across tasks and domains. Specifically, in the inner loop, we update the parameters of the given encoder by taking inner gradient steps using two different sets of augmented samples, and generate adversarial examples for each view by maximizing the instance classification loss. Then, in the outer loop, we meta-learn the encoder parameters to maximize the agreement between the two adversarial examples, which enables the encoder to learn robust representations. We experimentally validate the effectiveness of our approach on unseen domain adaptation tasks, on which it achieves impressive performance. Specifically, our method significantly outperforms state-of-the-art meta-adversarial learning methods on few-shot learning tasks, as well as self-supervised learning baselines in standard learning settings with large-scale datasets.
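To make the attack step and the agreement objective more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a toy linear encoder, uses one minus cosine similarity as a stand-in for the instance classification loss, and estimates gradients by finite differences. The names `encode`, `disagreement`, `attack`, and the values of `eps`, `step`, and `iters` are all hypothetical. The inner gradient steps and the meta-update of the encoder are omitted.

```python
import math

def encode(W, x):
    """Toy linear encoder z = W x (assumed stand-in for a deep encoder)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    return sum(p * q for p, q in zip(a, b)) / (na * nb + 1e-12)

def disagreement(W, x_a, x_b):
    """Assumed surrogate loss: 1 - cosine similarity of the two encodings."""
    return 1.0 - cosine(encode(W, x_a), encode(W, x_b))

def numeric_grad(f, x, h=1e-5):
    """Central finite-difference gradient of f at the list x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def attack(W, x, x_other, eps=0.1, step=0.02, iters=10):
    """Sign-gradient ascent on the loss, projected onto an L-inf ball of radius eps.

    Returns the perturbed input with the highest loss seen, so the result
    is never worse (from the attacker's side) than the clean input.
    """
    loss = lambda x_in: disagreement(W, x_in, x_other)
    delta = [0.0] * len(x)
    best_delta, best_loss = list(delta), loss(x)
    for _ in range(iters):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        g = numeric_grad(loss, x_adv)
        # Ascent step, then clip each coordinate back into [-eps, eps].
        delta = [max(-eps, min(eps, di + step * sign(gi)))
                 for di, gi in zip(delta, g)]
        cur = loss([xi + di for xi, di in zip(x, delta)])
        if cur > best_loss:
            best_delta, best_loss = list(delta), cur
    return [xi + di for xi, di in zip(x, best_delta)]

# Two hypothetical augmented views of the same instance.
W = [[1.0, 0.5, -0.3], [0.2, -1.0, 0.8]]
view_1 = [0.9, 0.1, 0.4]
view_2 = [1.0, 0.2, 0.3]

adv_1 = attack(W, view_1, view_2)   # adversarial example for view 1
adv_2 = attack(W, view_2, view_1)   # adversarial example for view 2

clean_loss = disagreement(W, view_1, view_2)
adv_loss = disagreement(W, adv_1, view_2)
# Quantity an outer loop would minimize with respect to W
# (equivalently, maximizing agreement between the two adversarial views).
outer_loss = disagreement(W, adv_1, adv_2)
```

In a full bilevel setup, `outer_loss` would be differentiated with respect to the encoder parameters through the inner updates. An automatic differentiation library would replace the finite-difference gradients used here.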