Poster in Workshop: Bridging the Gap: from Machine Learning Research to Clinical Practice
Harmonizing Attention: Attention Map Consistency For Unsupervised Fine-Tuning
Ali Mirzazadeh · Florian Dubost · Daniel Fu · Khaled Saab · Christopher Lee-Messer · Daniel Rubin
Learning meaningful representations is challenging when training data is scarce. Attention maps can be used to verify that a model has learned the target representations. Those representations should match human understanding, generalize to unseen data, and not focus on potential biases in the dataset. Attention maps are designed to highlight the regions of the model's input that were discriminative for its predictions. However, different attention map computation methods often highlight different regions of the input, sometimes with contradictory explanations for a prediction. This effect is exacerbated when the training set is small. It indicates that either the model learned incorrect representations or that the attention map methods did not accurately estimate the model's representations. We propose an unsupervised fine-tuning method that optimizes the consistency of attention maps and show that it improves both classification performance and the quality of attention maps. We provide an implementation for two state-of-the-art attention computation methods, Grad-CAM and Guided Backpropagation, that relies on an input masking technique. We evaluate the method on our own dataset of event detection in continuous video recordings of hospital patients, aggregated and curated for this work. As a sanity check, we also evaluate the proposed method on PASCAL VOC. On the video data, we show that the method can be combined with SimCLR, a state-of-the-art self-supervised training method, to further improve classification performance. With the proposed method, we achieve a 6.6-point lift in F1 score over SimCLR alone for classification on our video dataset, and a 2.9-point lift in F1 score over ResNet for classification on PASCAL VOC.
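A minimal sketch of what an attention-map consistency objective might look like, assuming min-max-normalized maps and an L1 discrepancy between them (both are illustrative assumptions; the paper's exact loss and input-masking procedure are not reproduced here, and the function names are hypothetical):

```python
import numpy as np

def normalize_map(attn):
    # Min-max normalize an attention map to [0, 1] so that maps from
    # different methods (e.g. Grad-CAM vs. Guided Backpropagation) are
    # compared on a common scale.
    attn = attn - attn.min()
    rng = attn.max()
    return attn / rng if rng > 0 else attn

def attention_consistency_loss(map_a, map_b):
    # Hypothetical consistency penalty: mean absolute difference between
    # two normalized attention maps. Minimizing it pushes the two
    # explanation methods toward highlighting the same input regions.
    return float(np.mean(np.abs(normalize_map(map_a) - normalize_map(map_b))))

# Example: identical maps incur zero loss; divergent maps are penalized.
gradcam_map = np.array([[0.1, 0.9], [0.2, 0.8]])
guided_bp_map = np.array([[0.9, 0.1], [0.8, 0.2]])
print(attention_consistency_loss(gradcam_map, gradcam_map))   # perfectly consistent
print(attention_consistency_loss(gradcam_map, guided_bp_map)) # inconsistent, loss > 0
```

In a fine-tuning loop, a term like this would be added to the training objective (unsupervised, since it needs no labels), which is what allows it to be combined with a self-supervised method such as SimCLR.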