Spotlight Poster
Diffusion Models With Learned Adaptive Noise
Subham Sahoo · Aaron Gokaslan · Christopher De Sa · Volodymyr Kuleshov
Diffusion models have gained traction as powerful algorithms for synthesizing high-quality images. Central to these algorithms is the diffusion process, a set of equations which maps data to noise in a way that can significantly affect performance. In this paper, we explore whether the diffusionprocess can be learned from data.Our work is grounded in Bayesian inference and seeks to improve log-likelihood estimation by casting the learned diffusion process as an approximate variational posterior that yields a tighter lower bound (ELBO) on the likelihood.A widely held assumption is that the ELBO is invariant to the noise process: our work dispels this assumption and proposes multivariate learned adaptive noise (MuLAN), a learned diffusion process that applies noise at different rates across an image. Our method consists of three components: a multivariate noise schedule, adaptive input-conditional diffusion, and auxiliary variables; these components ensure that the ELBO is no longer invariant to the choice of the noise schedule as in previous works. Empirically, MuLAN sets a new state-of-the-art in density estimation on CIFAR-10 and ImageNet while matching the performance of previous state-of-the-art models with 50% fewer steps.
Live content is unavailable. Log in and register to view live content