Poster in Workshop: NeurIPS 2023 Workshop on Diffusion Models
Latent Diffusion for Document Generation with Sequential Decoding
Zihuiwen Ye · Elle Michelle Yang · Phil Blunsom
We present LaDiDa (Latent Diffusion for Document Generation with Sequential Decoding), a new document-generation model. Large language models (LLMs) can produce impressive text, but the quality of their output degrades as it lengthens. As generation proceeds, models struggle to maintain discourse coherence and desirable text dynamics, leading to rambling and repetitive results. This difficulty with long-range generation can often be attributed to the autoregressive training objective, which compounds errors over many decoding steps. LaDiDa is a hierarchical model that improves long-text generation by decomposing the task into document-level and sentence-level stages. Our method combines document-level diffusion with sentence-level decoding: diffusion globally and non-autoregressively plans the sentences within a document, and decoding locally and sequentially generates those sentences. Compared to autoregressive models, LaDiDa achieves higher textual diversity and structural cohesion in generated text.
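To make the two-stage pipeline concrete, the sketch below shows the plan-then-decode structure in PyTorch. It is a minimal illustration under assumed details: every module name, architecture, and hyperparameter here is hypothetical, and the toy denoising loop stands in for a proper diffusion sampler; none of it reflects the authors' actual implementation.

```python
import torch
import torch.nn as nn

class DocumentPlanner(nn.Module):
    """Document-level stage: denoises a sequence of latent sentence
    embeddings, planning all sentences in the document jointly
    (non-autoregressively)."""

    def __init__(self, latent_dim=64, num_sentences=8, steps=50):
        super().__init__()
        self.steps = steps
        self.num_sentences = num_sentences
        self.latent_dim = latent_dim
        self.denoiser = nn.Sequential(
            nn.Linear(latent_dim + 1, 128),  # +1 for a scalar timestep signal
            nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    @torch.no_grad()
    def sample(self):
        # Start from Gaussian noise over all sentence latents at once.
        z = torch.randn(self.num_sentences, self.latent_dim)
        for t in reversed(range(self.steps)):
            t_signal = torch.full((self.num_sentences, 1), t / self.steps)
            # Crude denoising update: step toward the denoiser's prediction.
            z = z + 0.1 * (self.denoiser(torch.cat([z, t_signal], dim=-1)) - z)
        return z  # one latent "plan" per sentence

class SentenceDecoder(nn.Module):
    """Sentence-level stage: sequentially generates the tokens of one
    sentence, conditioned on that sentence's planned latent."""

    def __init__(self, latent_dim=64, vocab_size=100, max_len=12):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, latent_dim)
        self.rnn = nn.GRU(latent_dim, latent_dim, batch_first=True)
        self.out = nn.Linear(latent_dim, vocab_size)
        self.max_len = max_len

    @torch.no_grad()
    def generate(self, latent):
        tokens = [0]  # hypothetical BOS token id
        h = latent.view(1, 1, -1)  # seed the hidden state with the plan
        for _ in range(self.max_len):
            x = self.embed(torch.tensor([[tokens[-1]]]))
            y, h = self.rnn(x, h)
            tokens.append(self.out(y[0, -1]).argmax().item())
        return tokens

planner, decoder = DocumentPlanner(), SentenceDecoder()
plan = planner.sample()                    # global, non-autoregressive plan
doc = [decoder.generate(z) for z in plan]  # local, sequential realization
print(f"Generated {len(doc)} sentences of token ids, e.g. {doc[0]}")
```

The point of the decomposition is visible in the last three lines: the planner commits to all sentence latents before any token is emitted, so long-range document structure is fixed up front, while each sentence is still realized by a familiar sequential decoder.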