Poster
in
Workshop: Adaptive Foundation Models: Evolving AI for Personalized and Efficient Learning
Exploring Visual Prompt Tuning for Demographic Adaptation in Foundation Models for Medical Imaging
Artur Parkhimchyk · Amirreza Naziri · Laleh Seyyed-Kalantari
Pre-trained medical foundation models are large and require significant computational resources to train. Visual Prompt Tuning (VPT) allows foundation models to adapt efficiently to new tasks with minimal changes to the model's architecture, reducing the need for extensive fine-tuning. Here, we explore demographic (race) adaptation of foundation models (MAE and MoCoV3) for disease classification in medical imaging using naturally imbalanced data. We compare three adaptation strategies: linear probing, full fine-tuning, and VPT. We find that VPT obtains a clear performance boost over linear probing, starting at prompt length 5. For race demographics (e.g., Asian, 5.7% of the full dataset), a VPT model trained on a single demographic (Asian) performed similarly to a fully fine-tuned model trained on the same dataset. A foundation model fully fine-tuned on a large, diverse dataset performs better than a model adapted only to a specific subset of the data; however, this requires large-scale data and computational resources that may not always be available. These findings show that VPT can efficiently adapt foundation models to small datasets, achieving performance comparable to full fine-tuning.
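The abstract does not include implementation details, so the following is only a minimal PyTorch-style sketch of shallow VPT: learnable prompt tokens prepended to the patch embeddings of a frozen ViT backbone, with a prompt length of 5 as in the abstract. The backbone methods `patch_embed` and `encode` are hypothetical placeholders for whatever interface the actual MAE/MoCoV3 encoder exposes.

```python
import torch
import torch.nn as nn

class VisualPromptTuning(nn.Module):
    """Shallow VPT sketch: learnable prompt tokens are prepended to the
    patch embeddings of a frozen ViT; only the prompts and the
    classification head receive gradients."""

    def __init__(self, backbone, num_prompts=5, embed_dim=768, num_classes=2):
        super().__init__()
        self.backbone = backbone  # pre-trained ViT encoder (e.g., MAE or MoCoV3)
        for p in self.backbone.parameters():
            p.requires_grad = False  # freeze all backbone weights
        # learnable prompt tokens (prompt length 5 matches the abstract)
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, embed_dim) * 0.02)
        self.head = nn.Linear(embed_dim, num_classes)  # trainable classifier

    def forward(self, x):
        # hypothetical backbone API: patch_embed() returns [B, N, D] patch
        # tokens; encode() runs the transformer blocks and returns a
        # pooled (e.g., [CLS]) feature of shape [B, D]
        tokens = self.backbone.patch_embed(x)
        prompts = self.prompts.expand(tokens.shape[0], -1, -1)
        tokens = torch.cat([prompts, tokens], dim=1)  # prepend prompt tokens
        cls_feat = self.backbone.encode(tokens)
        return self.head(cls_feat)
```

Because only the prompt tokens and the linear head are trainable, the number of updated parameters is a tiny fraction of the full model, which is what makes VPT attractive for adapting to small demographic subsets.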