Poster in Workshop: Symmetry and Geometry in Neural Representations
Adversarially-robust representation learning through spectral regularization of features
Sheng Yang · Jacob Zavatone-Veth · Cengiz Pehlevan
Keywords: [ pretraining ] [ representational geometry ] [ adversarial robustness ]
The vulnerability of neural network classifiers to adversarial attacks is a major obstacle to their deployment in safety-critical applications. Regularizing network parameters during training can improve adversarial robustness and generalization performance. Usually, the network is regularized end-to-end, so that parameters at all layers are affected. However, in settings where learning representations is key, such as self-supervised learning (SSL), the layers after the feature representation are discarded at inference time. For these models, regularizing only up to the feature space is more suitable. To this end, we propose a new spectral regularizer for representation learning that encourages black-box adversarial robustness in downstream classification tasks.
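As a rough illustration of regularizing only up to the feature space, the sketch below penalizes the top singular value of the centered batch feature matrix produced by an encoder, alongside an arbitrary self-supervised objective. The specific penalty, the `encoder`/`ssl_loss` names, and the weight `lam` are assumptions for illustration; they are not necessarily the regularizer or training setup proposed in the paper.

```python
# Minimal sketch (PyTorch): a spectral penalty applied to learned features only.
# The choice of penalty here -- the largest singular value of the centered
# (batch x dim) feature matrix -- is an illustrative assumption.
import torch
import torch.nn as nn


def spectral_feature_penalty(features: torch.Tensor) -> torch.Tensor:
    """Largest singular value of the centered batch feature matrix."""
    z = features - features.mean(dim=0, keepdim=True)  # center across the batch
    return torch.linalg.matrix_norm(z, ord=2)  # spectral norm = top singular value


def training_step(encoder: nn.Module, ssl_loss, x: torch.Tensor, lam: float = 1e-3):
    # Hypothetical training step: `encoder` maps inputs to the feature space
    # used by downstream classifiers; `ssl_loss` is any self-supervised objective.
    # Only the features are regularized; any head after them is left untouched
    # and can be discarded at inference time.
    z = encoder(x)
    loss = ssl_loss(z) + lam * spectral_feature_penalty(z)
    return loss
```

Because the penalty is computed on the encoder output rather than on all layer weights, the regularization pressure stops at the feature space, matching the setting described in the abstract.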