NeurIPS Debiasing Global Workspace: A Cognitive Neural Framework for Learning Debiased and Interpretable Representations

Poster in Workshop: Workshop on Behavioral Machine Learning

Jinyung Hong · Eun Som Jeon · Changhoon Kim · Keun Hee Park · Utkarsh Nath · 'YZ' Yezhou Yang · Pavan Turaga · Theodore P. Pavlic

Abstract:

Deep neural networks (DNNs) often base their predictions on "spurious" attributes when trained on biased datasets, in which most samples have features that are spuriously correlated with the target labels. This can be problematic when the irrelevant features are easier for the model to learn than the truly relevant ones. Existing debiasing methods require predefined bias labels and incur extra computational cost from additional networks. We propose an alternative approach inspired by cognitive science, called Debiasing Global Workspace (DGW). DGW consists of specialized modules and a shared workspace, which increases modularity and improves debiasing performance. In addition, our method makes the decision-making process more transparent through attention masks. We validate DGW on various biased datasets and show that it achieves better debiasing performance.
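
The abstract does not say how the specialized modules and the shared workspace interact. As a rough illustration only, the sketch below shows one common way a shared workspace can read from several modules: each workspace slot scores every module's feature vector with scaled dot-product attention, and the resulting weights serve as an inspectable attention mask. The function names `softmax` and `workspace_read`, the slot and feature values, and the use of dot-product attention are all assumptions made for this example; this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def workspace_read(module_feats, slots):
    """Hypothetical workspace read step (an assumption, not the DGW method).

    module_feats: list of per-module feature vectors, each of length d.
    slots: list of workspace slot vectors, each of length d.
    Returns (updated_slots, masks), where masks[i][j] is the attention
    weight that slot i places on module j.
    """
    d = len(module_feats[0])
    updated, masks = [], []
    for slot in slots:
        # Scaled dot-product score between this slot and every module.
        scores = [
            sum(s * f for s, f in zip(slot, feat)) / math.sqrt(d)
            for feat in module_feats
        ]
        weights = softmax(scores)
        masks.append(weights)
        # The slot's new content is the attention-weighted mix of module features.
        updated.append(
            [sum(w * feat[k] for w, feat in zip(weights, module_feats)) for k in range(d)]
        )
    return updated, masks


# Toy inputs: three modules with 2-dimensional features, two workspace slots.
module_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
slots = [[2.0, 0.0], [0.0, 2.0]]
updated, masks = workspace_read(module_feats, slots)
for i, mask in enumerate(masks):
    print(f"slot {i} attention mask:", [round(w, 3) for w in mask])
```

Each row of `masks` sums to one, so it can be read directly as how much a slot relies on each module, which is the kind of transparency that attention masks are meant to provide.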
