

Poster in Workshop: UniReps: Unifying Representations in Neural Models

Small-scale adversarial perturbations expose differences between predictive encoding models of human fMRI responses

Nikolas McNeal · Mainak Deb · N Apurva Ratan Murty

Keywords: [ vision ] [ encoding models ] [ model sensitivity ] [ adversarial attacks ]


Abstract:

Artificial neural network-based vision encoding models have made significant strides in predicting neural responses and providing insights into visual cognition. However, progress appears to be slowing, with many encoding models achieving similar levels of accuracy in predicting brain activity. In this study, we show that encoding models of human fMRI responses are highly vulnerable to small-scale adversarial attacks, revealing differences not captured by predictive accuracy alone. We then test adversarial sensitivity as a complementary evaluation measure and show that it offers a more effective way to distinguish between highly predictive encoding models. While explicit adversarial training can increase the robustness of encoding models, we find that it comes at the cost of brain prediction accuracy. Our preliminary findings also indicate that the choice of mapping from model features to brain responses may play a role in optimizing both robustness and accuracy, with sparse mappings typically yielding more robust encoding models of neural activity. These findings reveal key vulnerabilities of current models, introduce a novel evaluation procedure, and offer a path toward improving the balance between robustness and predictive accuracy in future encoding models.
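As a rough illustration of the adversarial-sensitivity idea described in the abstract, the sketch below applies a single small FGSM-style perturbation to an input image and reports the relative change in an encoding model's predicted voxel pattern. This is a minimal sketch, not the authors' procedure: the `encoding_model` interface (a vision backbone plus a features-to-voxels mapping), the choice of attack objective, and the `epsilon` budget are all illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): probing the adversarial
# sensitivity of a hypothetical image-computable fMRI encoding model.
# `encoding_model` is assumed to map an image tensor to predicted voxel
# responses; `epsilon` is an arbitrary small perturbation budget.
import torch


def adversarial_sensitivity(encoding_model, image, epsilon=1e-3):
    """Relative change in predicted voxel responses under one FGSM-style step."""
    image = image.clone().detach().requires_grad_(True)

    # Predicted voxel responses for the clean image, shape (n_voxels,).
    baseline = encoding_model(image)

    # Gradient of the prediction's L2 norm w.r.t. the input gives a simple
    # perturbation direction; a sign step keeps the perturbation small-scale.
    loss = baseline.norm()
    loss.backward()
    perturbed = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

    with torch.no_grad():
        attacked = encoding_model(perturbed)

    # Sensitivity: relative displacement of the predicted response pattern.
    return (attacked - baseline.detach()).norm().item() / baseline.norm().item()
```

Under this framing, two models with similar prediction accuracy could still be separated by how large this ratio is for a fixed perturbation budget, which is the kind of complementary comparison the abstract proposes.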
