Poster in Workshop: Medical Imaging meets NeurIPS

Temporal Fine-tuning of Medical Vision-Language Representation

Haoxu Huang · Kyunghyun Cho · Sumit Chopra · Divyam Madaan


Abstract:

Despite the abundant data sources available in biomedical applications, existing machine learning models fail to effectively harness these resources for patient diagnosis. This work focuses on visual and textual data, which are often used together to pre-train multimodal representations; however, the final diagnosis is based solely on the fine-tuned image encoder. To address this limitation, we introduce a novel framework that leverages temporal information from previous medical image examinations and their associated reports during fine-tuning. Our evaluation, conducted on the MIMIC dataset with a newly proposed temporal data generation process, demonstrates an average improvement of up to 3.89% compared to using only image data for diagnosis.
