Poster in Workshop: Fine-Tuning in Modern Machine Learning: Principles and Scalability
Efficient Fine-Tuning of CNN-based Foundation Models for Segmentation in 3D Medical Images
Mees Hudepohl · Suraj Pai · Heysem Kaya · Hugo Aerts
Medical imaging techniques like Computed Tomography (CT) are crucial for disease detection and treatment, with semantic segmentation being essential for accurate analysis. Despite the potential of deep learning models, particularly Convolutional Neural Networks (CNNs), for automated segmentation, the limited availability of labeled data in medical imaging remains an obstacle. To address this problem, foundation models have been introduced, which require fine-tuning to adapt to specific tasks. However, state-of-the-art methods like full fine-tuning are storage-intensive and prone to forgetting and overfitting. As a more efficient alternative, Parameter-Efficient Fine-Tuning (PEFT) techniques have been developed. Nevertheless, most PEFT research has concentrated on transformer-based models applied to language or 2D natural images, leaving a gap in the application of these techniques to CNN-based models for 3D medical imaging. This study addresses the gap by applying the PEFT technique ConvAdapter to SegResNet, a CNN-based segmentation model, for segmenting organs in 3D CT images. The goal is to enhance performance while minimizing the number of tunable parameters. We demonstrate that integrating ConvAdapter within SegResNet achieves an effective balance between performance and parameter efficiency, yielding a Mean Dice score of 0.84 on the test set while tuning only 0.7M parameters, less than 15% of the total model parameters. ConvAdapter maintains performance trends similar to full fine-tuning and shows promising generalization across diverse datasets, even outperforming full fine-tuning on magnetic resonance (MR) data. These findings highlight the potential of PEFT techniques to improve the efficiency of fine-tuning CNN models for medical imaging, particularly for complex tasks like 3D organ segmentation. With further refinement and integration with self-supervised foundation models, these techniques hold promise for developing even more adaptable and efficient models.
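For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean Dice score over organ labels can be computed from a predicted and a reference label map. It is not the authors' evaluation code: the abstract does not state how scores are averaged or how absent labels are handled, so averaging over foreground labels and skipping labels missing from both maps are assumptions made here, and the function names `dice_per_class` and `mean_dice` are hypothetical. Label maps are given as flattened lists of integer voxel labels.

```python
from typing import Dict, Sequence


def dice_per_class(
    pred: Sequence[int], target: Sequence[int], labels: Sequence[int]
) -> Dict[int, float]:
    """Return Dice = 2|P ∩ T| / (|P| + |T|) for each label.

    `pred` and `target` are flattened voxel label maps of equal length.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")

    scores: Dict[int, float] = {}
    for label in labels:
        pred_count = sum(1 for p in pred if p == label)
        target_count = sum(1 for t in target if t == label)
        overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
        denominator = pred_count + target_count
        if denominator == 0:
            # Assumption: a label absent from both maps is skipped, not scored.
            continue
        scores[label] = 2.0 * overlap / denominator
    return scores


def mean_dice(
    pred: Sequence[int], target: Sequence[int], labels: Sequence[int]
) -> float:
    """Average the per-label Dice scores (assumed unweighted mean over labels)."""
    scores = dice_per_class(pred, target, labels)
    if not scores:
        raise ValueError("none of the requested labels occur in pred or target")
    return sum(scores.values()) / len(scores)


# Toy example: 8 voxels, background 0 and two organ labels 1 and 2.
target = [0, 1, 1, 1, 2, 2, 0, 0]
pred = [0, 1, 1, 2, 2, 2, 0, 1]

per_label = dice_per_class(pred, target, labels=[1, 2])
average = mean_dice(pred, target, labels=[1, 2])
print(per_label, average)
```

In this toy case label 1 overlaps on 2 of 3 predicted and 3 reference voxels (Dice 4/6), and label 2 overlaps on 2 of 3 predicted and 2 reference voxels (Dice 4/5), so the unweighted mean is 11/15. Background is excluded simply by leaving label 0 out of `labels`.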