

Poster in Workshop: Medical Imaging meets NeurIPS

Unveiling the Interplay Between Interpretability and Generative Performance in Medical Diffusion Models

Mischa Dombrowski · Hadrien Reynaud · Johanna Paula Müller · Matthew Baugh · Bernhard Kainz


Abstract:

Generative diffusion models are showing promising utility in medical imaging, particularly for synthesizing high-quality images such as MRI scans and 4D data. However, despite advances in multi-modal models that leverage both textual and visual information, a significant gap remains in understanding the trade-off between image generation quality and model interpretability. In this paper, we investigate this issue by fine-tuning a Stable Diffusion v2 model with a focus on text-image embeddings. Specifically, we assess the impact of keeping the language encoder frozen during fine-tuning. We show that freezing the language encoder significantly improves the interpretability of the generated images without compromising quality. Through extensive evaluation on MS-COCO for in-domain training and MIMIC-CXR for out-of-domain data, we demonstrate that our approach outperforms existing baselines trained specifically for localization, in both localization capability and generative quality across multiple disease classes. This study is a foundational step towards high-performing yet interpretable generative models in medical imaging, addressing a critical need for effective and responsible AI adoption in healthcare.
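To make concrete what "keeping the language encoder frozen during fine-tuning" means in practice, below is a minimal sketch of a UNet-only fine-tuning loop in the Hugging Face diffusers ecosystem. The checkpoint name (`stabilityai/stable-diffusion-2-base`), hyperparameters, and training step are illustrative assumptions based on the standard text-to-image fine-tuning recipe, not the authors' actual implementation or data pipeline.

```python
# Sketch: fine-tune only the UNet of Stable Diffusion v2 while keeping the
# CLIP text encoder (and VAE) frozen. All hyperparameters are assumptions.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, DDPMScheduler, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "stabilityai/stable-diffusion-2-base"  # assumed base checkpoint
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
noise_scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

# The design choice under study: freeze the language encoder so the
# text-image embedding space is preserved while the UNet adapts.
text_encoder.requires_grad_(False)
vae.requires_grad_(False)  # the VAE is conventionally frozen as well
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

def training_step(images, captions):
    """One denoising step; `images` are in [-1, 1] with shape (B, 3, H, W)."""
    with torch.no_grad():
        # Encode images to latents and captions to frozen CLIP embeddings.
        latents = vae.encode(images).latent_dist.sample() * vae.config.scaling_factor
        tokens = tokenizer(captions, padding="max_length", truncation=True,
                           max_length=tokenizer.model_max_length,
                           return_tensors="pt")
        text_embeds = text_encoder(tokens.input_ids)[0]  # frozen conditioning
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                              (latents.shape[0],), device=latents.device)
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)
    pred = unet(noisy_latents, timesteps, encoder_hidden_states=text_embeds).sample
    loss = F.mse_loss(pred, noise)  # epsilon-prediction objective (SD2-base)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

One plausible reading of why this helps interpretability: since the cross-attention conditioning stays in the original CLIP embedding space, the correspondence between text tokens and image regions is not distorted by fine-tuning, which should make attention-based localization more reliable.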
