

Poster
in
Workshop: Medical Imaging meets NeurIPS

M3-X: Multimodal Generative Model for Screening Mammogram Reading and Explanation

Man Luo · Amara Tariq · Bhavik Patel · Imon Banerjee


Abstract:

The United States Food and Drug Administration (FDA) has approved multiple automated mammogram image reading models, but most of these models lack interpretability. Efforts have been made to interpret a model's decision through saliency maps or Grad-CAM [Selvaraju et al., 2017], which highlight the areas within the image that the model attends to. While technically sound, these interpretability maps may not be well received by radiologists because of the ambiguity and uncertainty of the findings. We therefore hypothesize that, in addition to deriving the diagnosis, a text-based semantic explanation of a model's attention (similar to the findings documented in radiology reports) may be more readily understandable by humans and may thus serve as a more trustworthy component of an AI model. The purpose of our study was therefore to develop a transformer-based multimodal generative model for the automatic interpretation of screening mammogram studies and the generation of text-based reasoning. Experimental results on the X-Institution mammogram screening dataset (the institution's name is withheld for anonymity) demonstrate that our model significantly outperforms the baselines in both accuracy and the quality of its explanations.
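To make the Grad-CAM comparison point concrete, the sketch below shows the core Grad-CAM computation: each channel of a CNN's last convolutional feature map is weighted by its globally averaged gradient, the weighted maps are summed, and a ReLU keeps only positive evidence. This is a minimal illustration of the general technique, not the authors' pipeline; the activation and gradient arrays are assumed to have been extracted from some CNN beforehand.

```python
import numpy as np

def grad_cam_map(activations, gradients):
    """Minimal Grad-CAM sketch (Selvaraju et al., 2017).

    activations: (K, H, W) feature maps A^k from the last conv layer,
                 for a single image.
    gradients:   (K, H, W) gradients of the target class score with
                 respect to those feature maps.
    Returns a (H, W) coarse class-activation heatmap.
    """
    # alpha_k: global-average-pool the gradients per channel.
    alpha = gradients.mean(axis=(1, 2), keepdims=True)   # (K, 1, 1)
    # Weighted combination of activation maps, then ReLU.
    cam = np.maximum((alpha * activations).sum(axis=0), 0.0)
    return cam

# Toy example with hypothetical 2-channel, 3x3 feature maps.
acts = np.ones((2, 3, 3))
grads = np.ones((2, 3, 3))
heatmap = grad_cam_map(acts, grads)  # uniform map of value 2.0
```

The resulting heatmap has the (coarse) spatial resolution of the feature map and is typically upsampled to the input image size before overlaying; the abstract's argument is that radiologists may still find such maps ambiguous compared with a textual finding.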
