Poster in Workshop: Fine-Tuning in Modern Machine Learning: Principles and Scalability
Investigating the Role of Fine-Tuning in Addressing the Gap Between Synthetic and Real Data in Generative Foundation Models
Leonhard Hennicke · Christian Medeiros Adriano · Holger Giese · Lukas Schott · Jan Koehler
Generative foundation models like Stable Diffusion have shown potential for transfer learning in computer vision, particularly for training student models on generated data for downstream tasks. However, these student models often exhibit a significant drop in accuracy compared to models trained on real data. In this work, we investigate the causes of this drop, focusing on the role of the different layers of the student model. Our findings reveal that the drop stems mainly from the model's final layers. Building on this insight, we investigate the data efficiency of fine-tuning only the last layers of a synthetically trained model with real data. Our results suggest an improved trade-off between the amount of real training data used for fine-tuning and the model's accuracy. These findings contribute to the understanding of the gap between synthetic and real data and indicate a potential way to mitigate the scarcity of labeled real data.
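The sketch below illustrates the general idea of last-layer fine-tuning described in the abstract; it is not the authors' code. It assumes a torchvision ResNet-18 as the student model and treats the final residual stage plus the classification head as "the last layers" — the paper's actual architecture, layer split, and hyperparameters may differ.

```python
# Minimal sketch of fine-tuning only the last layers of a synthetically
# trained student with real data. The student architecture (ResNet-18),
# the layer split, and the checkpoint name are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

# Hypothetical student model previously trained on Stable Diffusion outputs.
student = resnet18(num_classes=10)
# student.load_state_dict(torch.load("synthetic_student.pt"))  # hypothetical checkpoint

# Freeze every parameter, then unfreeze only the final residual stage
# and the classification head.
for param in student.parameters():
    param.requires_grad = False
for param in student.layer4.parameters():  # last residual stage
    param.requires_grad = True
for param in student.fc.parameters():      # classification head
    param.requires_grad = True

# Optimize only the unfrozen (last-layer) parameters, using a small
# amount of real labeled data.
optimizer = torch.optim.SGD(
    (p for p in student.parameters() if p.requires_grad),
    lr=1e-3,
    momentum=0.9,
)
criterion = nn.CrossEntropyLoss()

def finetune_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One fine-tuning step on a batch of real data."""
    optimizer.zero_grad()
    loss = criterion(student(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because gradients are computed only for the unfrozen layers, each step is cheaper than full fine-tuning and, per the abstract's findings, targets the layers where most of the synthetic-to-real accuracy drop originates.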