

Poster in Workshop: Compositional Learning: Perspectives, Methods, and Paths Forward

Unraveling the Latent Hierarchical Structure of Language and Images via Diffusion Models

Antonio Sclocchi · Noam Levi · Alessandro Favero · Matthieu Wyart

Keywords: [ diffusion models ] [ data structure ] [ hierarchical compositionality ] [ statistical physics ]


Abstract:

High-dimensional data must be highly structured to be learnable. Although the compositional and hierarchical nature of data is often put forward to explain learnability, quantitative measurements establishing these properties are scarce. Likewise, accessing the latent variables underlying such a data structure remains a challenge. Forward-backward experiments in diffusion-based models, where a datum is noised and then denoised, are a promising tool to achieve these goals. Using simple hierarchical models, we predict that in this process, changes in data occur in correlated chunks, with a length scale that diverges at a noise level where a phase transition is known to take place. Remarkably, we confirm this prediction in both text and image datasets using state-of-the-art diffusion models. Our results suggest that forward-backward experiments are informative about the nature of latent variables, and that the effect of changing deeper ones is revealed near the transition.
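To make the forward-backward protocol concrete, here is a minimal sketch assuming a generic DDPM-style diffusion model. The schedule tensor `alphas_cumprod` and the one-step reverse update `model.denoise_step` are hypothetical placeholders, not names from the paper or any specific library: the point is only the structure of the experiment (one-shot noising to a chosen level, iterative denoising back, then comparison against the original datum).

```python
# Minimal sketch of a forward-backward experiment with a DDPM-style model.
# `model` and its `denoise_step` method are hypothetical placeholders
# standing in for any pretrained denoiser; `alphas_cumprod` is the
# cumulative product of the noise-schedule alphas, indexed by timestep.
import torch

def forward_backward(x0, model, alphas_cumprod, t_star):
    """Noise a clean datum x0 up to step t_star, then denoise back to t=0."""
    # Forward process, applied in one shot:
    # x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps
    abar = alphas_cumprod[t_star]
    eps = torch.randn_like(x0)
    x_t = abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps

    # Backward process: iteratively denoise from t_star down to 0.
    x = x_t
    for t in reversed(range(t_star)):
        x = model.denoise_step(x, t)  # hypothetical one-step reverse update

    return x  # compare against x0 to locate where the datum changed

# Usage idea: sweep t_star and record where x0 and the regenerated datum
# differ. The paper's claim is that these changes occur in correlated
# chunks whose length scale peaks near the phase-transition noise level.
```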
