

Poster in Workshop on Machine Learning and Compression

Mind the Gap Between Synthetic and Real: Probing Transfer Capabilities of Stable Diffusion Images

Leonhard Hennicke · Christian Medeiros Adriano · Holger Giese · Jan Koehler · Lukas Schott


Abstract:

Generative foundation models like Stable Diffusion encode a diverse spectrum of knowledge in computer vision and are multiple orders of magnitude smaller in storage size than large datasets. Nonetheless, they hold potential for transfer learning, e.g., by generating data to train student models for downstream tasks, thereby presenting a form of data-free knowledge distillation based on a flexible, implicit dataset compressed into a neural network. However, the resulting student models show a significant drop in accuracy compared with models trained on real data. We investigate possible causes for this drop and focus on the role of the different layers of the student model. By training these layers on either real or synthetic data, we reveal that the drop mainly stems from the model's final layers. Further, we briefly investigate other factors, such as differences in data normalization between synthetic and real data, the impact of data augmentations, texture versus shape learning, and the assumption of oracle prompts. While we find that some of these factors can have an impact, they are not sufficient to close the performance gap to real data. Building on our insight that mainly the later layers are responsible for the drop, we investigate the data efficiency of fine-tuning only those last layers of a synthetically trained model with real data. Our results suggest an improved trade-off between the amount of training with real data and the model's accuracy. Our findings contribute to the understanding of the performance gap between training with synthetic and real data and indicate ways to mitigate the scarcity of labeled real data.
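The fine-tuning strategy described above, freezing the synthetically trained early layers and updating only the last layers on real data, can be sketched as follows. This is a minimal illustration, not the paper's actual code; the architecture, the split point, and all hyperparameters are assumptions.

```python
# Hypothetical sketch: fine-tune only the last layer of a student model
# whose weights were first learned from Stable Diffusion-generated data.
import torch
import torch.nn as nn

# Toy student CNN standing in for the paper's (unspecified here) architecture.
student = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),   # early layers: keep
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),  # synthetic-data weights
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),                           # last layer: retrain on real data
)

# Freeze all parameters, then unfreeze only the final linear layer.
for p in student.parameters():
    p.requires_grad = False
for p in student[-1].parameters():
    p.requires_grad = True

# The optimizer only receives the trainable (last-layer) parameters,
# so gradient updates from the scarce real data touch nothing else.
optimizer = torch.optim.SGD(
    (p for p in student.parameters() if p.requires_grad), lr=1e-3
)
```

Which layers count as "last" is a design choice; the abstract's layer-wise analysis suggests moving the freeze boundary earlier or later trades real-data requirements against accuracy.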
