Poster in Workshop: 5th Workshop on Self-Supervised Learning: Theory and Practice
Self-Supervised Learning Using Controlled Diffusion Image Augmentation
Judah Goldfeder · Patrick Puma · Gabriel Guo · Gabriel Trigo · Hod Lipson
While synthetic data generated by diffusion models has been shown to improve performance across various tasks, existing approaches face two challenges: fine-tuning a diffusion model for a specific dataset is often expensive, and the domain gap between real and synthetic data limits the usefulness of synthetic data, especially in fine-grained classification settings. To mitigate these shortcomings, we developed CDaug, a novel approach to data augmentation using controlled diffusion. Instead of using diffusion models to generate wholly new images, we take a self-supervised approach and condition the generated images on existing data, allowing us to create high-quality synthetic augmentations that capture the semantic priors and underlying structure of the data while infusing meaningful and novel variations, all with no human intervention. Our pipeline uses ControlNet, conditioned on the original data, with captions generated by the multi-modal LLM LLaVA2 to guide the generative process. The approach relies on open-source models, requires no fine-tuning, and is modular. We demonstrate improved performance on 7 fine-grained datasets, in both few-shot and full-dataset settings, across many architectures.
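As a concrete illustration, below is a minimal sketch of the kind of controlled-diffusion augmentation loop the abstract describes, built on the Hugging Face diffusers library. The choice of a Canny-edge ControlNet, Stable Diffusion 1.5 weights, and the fixed caption string are assumptions for the sketch; in CDaug the caption would instead come from the multi-modal LLM captioner, and the exact control signal and base model may differ.

```python
# Sketch of ControlNet-conditioned image augmentation (assumptions:
# Canny-edge conditioning, Stable Diffusion 1.5; not the paper's exact setup).
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline


def make_canny_condition(image: Image.Image) -> Image.Image:
    """Extract a Canny edge map so generation preserves the source structure."""
    edges = cv2.Canny(np.array(image), 100, 200)
    return Image.fromarray(np.stack([edges] * 3, axis=-1))


controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

source = Image.open("bird.jpg").convert("RGB").resize((512, 512))

# In CDaug this prompt is produced by a multi-modal LLM captioner;
# a fixed string stands in for that step here.
caption = "a photo of a small songbird perched on a branch"

# Generate a synthetic variant conditioned on the source image's edge map.
augmented = pipe(
    prompt=caption,
    image=make_canny_condition(source),
    num_inference_steps=30,
).images[0]
augmented.save("bird_augmented.jpg")
```

Because generation is conditioned on the source image's structure, the output plausibly retains the source's label, which is what lets it serve as an augmentation rather than an unlabeled wholly new sample.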