Poster

AID: Attention Interpolation of Text-to-Image Diffusion

He Qiyuan · Jinghao Wang · Ziwei Liu · Angela Yao

Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Conditional diffusion models can generate unseen images across a wide range of settings, which makes them a natural tool for image interpolation. Interpolation in latent space is well studied, but interpolation under specific conditions such as text or pose is less understood. Common, simple approaches apply linear interpolation in the conditioning space but tend to produce inconsistent images with poor fidelity. This work introduces a novel training-free technique named Attention Interpolation via Diffusion (AID). AID makes two key contributions: 1) a fused inner/outer interpolated attention layer that boosts image consistency and fidelity; and 2) selection of interpolation coefficients via a Beta distribution to increase smoothness. Additionally, we present an AID variant called Prompt-guided Attention Interpolation via Diffusion (PAID), which 3) treats interpolation as a condition-dependent generative process. Experiments demonstrate that our method achieves greater consistency, smoothness, and efficiency in condition-based interpolation, aligning closely with human preferences. Furthermore, PAID can significantly benefit compositional generation and image editing control, all while remaining training-free.
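To make the two mechanisms named above concrete, the toy sketch below interpolates the key/value tensors of an attention layer between two conditions along an "inner" and an "outer" path, and spaces the interpolation coefficients by Beta quantiles. This is a minimal illustration built on plain PyTorch and SciPy, not the released AID code: the function names, the equal-weight fusion of the two paths, and the Beta(2, 2) parameters are all illustrative assumptions.

    import numpy as np
    import torch
    from scipy.stats import beta as beta_dist


    def attention(q, k, v):
        # Standard scaled dot-product attention.
        scale = q.shape[-1] ** -0.5
        weights = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
        return weights @ v


    def fused_interpolated_attention(q, k0, v0, k1, v1, t):
        # "Inner" path: blend the key/value tensors of the two conditions
        # before computing attention.
        inner = attention(q, (1 - t) * k0 + t * k1, (1 - t) * v0 + t * v1)
        # "Outer" path: attend to each condition separately, then blend
        # the attention outputs.
        outer = (1 - t) * attention(q, k0, v0) + t * attention(q, k1, v1)
        # Fuse the two paths (equal weighting here, purely an assumption).
        return 0.5 * (inner + outer)


    def beta_coefficients(n, a=2.0, b=2.0):
        # Map a uniform grid through the Beta inverse CDF so frames are
        # spaced by Beta quantiles rather than linearly, which can yield
        # smoother perceptual transitions.
        return beta_dist.ppf(np.linspace(0.0, 1.0, n), a, b)


    if __name__ == "__main__":
        B, H, L, D = 1, 8, 16, 64  # toy shapes: batch, heads, tokens, head dim
        q = torch.randn(B, H, L, D)
        k0, v0 = torch.randn(B, H, L, D), torch.randn(B, H, L, D)  # condition A
        k1, v1 = torch.randn(B, H, L, D), torch.randn(B, H, L, D)  # condition B
        for t in beta_coefficients(5):
            out = fused_interpolated_attention(q, k0, v0, k1, v1, float(t))
            print(f"t = {t:.3f}  output norm = {out.norm():.2f}")

In the paper's setting the queries would come from the denoising network's attention layers and the key/value pairs from the two conditions being interpolated; random tensors stand in here so the script runs on its own.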
