Poster in Workshop: Adaptive Foundation Models: Evolving AI for Personalized and Efficient Learning
Leveraging Self Weak-supervision for Improved VLM Performance
Shuvendu Roy · Ali Etemad
In this work, we present SelfPrompt, a novel prompt-tuning approach for vision-language models (VLMs) in a semi-supervised learning setup. Existing methods for tuning VLMs in a semi-supervised setup struggle with the efficient use of the limited label budget, the accumulation of noisy pseudo-labels, and the proper utilization of unlabelled data. SelfPrompt addresses these challenges by introducing (a) a weakly-supervised sampling technique that selects a diverse and representative labelled set, (b) a cluster-guided pseudo-labelling method that improves pseudo-label accuracy, and (c) a confidence-aware semi-supervised learning module that maximizes the utility of unlabelled data by learning from high- and low-confidence pseudo-labels differently. We conduct extensive evaluations across 13 datasets, where SelfPrompt significantly surpasses state-of-the-art performance, with an average improvement of 7.92% in semi-supervised learning using a 2-shot setup. Our detailed ablation studies demonstrate the effectiveness of each component.
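As a rough illustration of the idea behind component (c), the sketch below partitions unlabelled samples by the confidence of their pseudo-labels so that the two groups can be trained on differently. It is not the SelfPrompt implementation: the function name `split_by_confidence`, the fixed `threshold` of 0.8, and the choice to keep a hard label for confident samples and the full soft distribution for the rest are all assumptions made for this example, since the abstract does not specify how the two groups are treated.

```python
from typing import List, Tuple

def split_by_confidence(
    probs: List[List[float]], threshold: float = 0.8
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, List[float]]]]:
    """Partition samples by the confidence of their predicted class.

    probs: one class-probability list per unlabelled sample.
    threshold: hypothetical cut-off, not a value taken from the abstract.
    Returns (high, low), where high holds (sample_index, hard_label) pairs
    and low holds (sample_index, soft_distribution) pairs.
    """
    high: List[Tuple[int, int]] = []
    low: List[Tuple[int, List[float]]] = []
    for i, p in enumerate(probs):
        # Index of the most probable class is the pseudo-label.
        label = max(range(len(p)), key=p.__getitem__)
        if p[label] >= threshold:
            # Confident prediction: keep a hard pseudo-label.
            high.append((i, label))
        else:
            # Uncertain prediction: keep the soft distribution (assumption).
            low.append((i, list(p)))
    return high, low

# Three unlabelled samples over three classes (made-up probabilities).
example_probs = [[0.9, 0.05, 0.05], [0.4, 0.35, 0.25], [0.1, 0.85, 0.05]]
high_conf, low_conf = split_by_confidence(example_probs)
```

In a training loop, the `high_conf` group could feed a standard cross-entropy loss while the `low_conf` group feeds a softer objective; which objectives SelfPrompt actually uses is not stated in the abstract.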