Poster in Workshop: GenAI for Health: Potential, Trust and Policy Compliance
Demo: Harnessing Generative AI for Comprehensive Evaluation of Medical Imaging AI
Yisak Kim · Seunghyun Jang · Soyeon Kim · Kyungmin Jeon · Chang Min Park
Keywords: [ Diffusion ] [ Interpretability ] [ Synthetic Data ] [ Benchmark ] [ Evaluation ]
Evaluating AI models for medical imaging, particularly for tasks such as nodule detection, is challenging because large, diverse, and well-annotated datasets are scarce. This scarcity makes it hard to assess model performance accurately and to judge how well a model generalizes to real-world scenarios. To address these challenges, we introduce SynNodBench, a generative AI-based demo that generates synthetic lung nodules of varying sizes, shapes, and locations on chest X-rays. Our method employs a diffusion-based inpainting model trained on the NODE21 dataset, allowing for the creation of realistic and customizable synthetic nodules.

We conducted multiple experiments to illustrate the utility of the demo in understanding nodule detection model behavior. First, by generating a large-scale synthetic test set, we identified a positive correlation between nodule size and model confidence, a relationship that was not observed with smaller real-world datasets. Second, we showed how the number of nodules in an image influences detection sensitivity: the presence of additional nodules can increase sensitivity in detecting otherwise missed lesions. Third, we examined whether a nodule detection model would correctly ignore nodules in anatomically impossible regions, such as air-leak areas, and confirmed the model’s robustness in these cases. Our findings show that synthetic data provides a scalable and effective way to evaluate AI models in healthcare.
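As an illustration of the size-versus-confidence experiment described in the abstract, the analysis could be sketched roughly as follows. This is a minimal sketch and not the authors' code: `toy_detector_confidence` is a hypothetical stand-in for a real nodule detector, and the diameter range, logistic shape, and noise level are placeholder assumptions. The values it produces are not results from SynNodBench.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def toy_detector_confidence(diameter_mm, rng):
    """Hypothetical detector: confidence rises with nodule diameter.

    A real evaluation would instead run the detector on a chest X-ray
    containing a synthetic nodule of this size and read off its score.
    """
    score = 1.0 / (1.0 + math.exp(-(diameter_mm - 15.0) / 5.0))
    score += rng.gauss(0.0, 0.05)  # assumed noise level, for illustration
    return min(1.0, max(0.0, score))


def size_confidence_correlation(diameters_mm, confidence_fn):
    """Score every synthetic nodule size and correlate size with confidence."""
    confidences = [confidence_fn(d) for d in diameters_mm]
    return pearson(diameters_mm, confidences)


rng = random.Random(0)
# Placeholder diameters (mm) standing in for a large synthetic test set.
diameters_mm = [rng.uniform(5.0, 30.0) for _ in range(500)]
r = size_confidence_correlation(
    diameters_mm, lambda d: toy_detector_confidence(d, rng)
)
print(f"Pearson r between nodule size and confidence: {r:.3f}")
```

With a real detector, `confidence_fn` would wrap inference on the generated images. A rank-based measure such as Spearman correlation may be preferable if the relationship is monotonic but not linear.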