

Poster

Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models

Jiacheng Ye · Shansan Gong · Liheng Chen · Lin Zheng · Jiahui Gao · Han Shi · Chuan Wu · Xin Jiang · Zhenguo Li · Wei Bi · Lingpeng Kong

[ Project Page ]
Wed 11 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Recently, diffusion models have garnered significant interest in the field of text processing due to their many potential advantages over conventional autoregressive models. In this work, we propose Diffusion-of-Thought (DoT), a novel approach that integrates diffusion models with Chain-of-Thought, a well-established technique for improving the reasoning ability of autoregressive language models. In contrast to autoregressive language models, which make decisions in a left-to-right, token-by-token manner, DoT allows reasoning steps to diffuse over time through a diffusion language model and offers greater flexibility in trading off computation for reasoning performance. Our experimental results demonstrate the effectiveness of DoT on multi-digit multiplication, Boolean logic, and grade school math problems. In addition, DoT showcases promising self-correction abilities and benefits from existing reasoning-enhancing techniques such as self-consistency decoding. Our findings contribute to the understanding and development of reasoning with diffusion language models.
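To make the sampling idea concrete, below is a minimal, hypothetical sketch of diffusion-style rationale generation paired with self-consistency voting. It assumes a masked (absorbing-state) denoiser with the signature model(question_ids, noisy_rationale, step); the actual DoT implementation may use a different diffusion formulation, and all names, shapes, and the re-masking schedule here are illustrative assumptions rather than the authors' code.

```python
import torch
from collections import Counter

def dot_sample(model, question_ids, rationale_len, num_steps, mask_id, device="cpu"):
    """Iteratively denoise a fully masked rationale into a chain of thought.

    `model` is assumed to map (question_ids, noisy_rationale, step) to
    per-position logits of shape (1, rationale_len, vocab_size); this
    interface is a hypothetical stand-in, not the paper's exact API.
    """
    # Start from the maximally noisy state: every rationale position masked.
    x = torch.full((1, rationale_len), mask_id, dtype=torch.long, device=device)
    for t in reversed(range(1, num_steps + 1)):
        logits = model(question_ids, x, t)   # predict all tokens jointly
        x0_hat = logits.argmax(dim=-1)       # greedy estimate of the clean text
        # Re-mask a fraction of positions matching the remaining noise level,
        # so later steps can revise earlier predictions (self-correction).
        keep = torch.rand(1, rationale_len, device=device) >= (t - 1) / num_steps
        x = torch.where(keep, x0_hat, torch.full_like(x0_hat, mask_id))
    return x  # fully unmasked at t == 1

def self_consistency(sample_once, extract_answer, k=10):
    """Majority-vote the final answer over k independent DoT samples."""
    answers = [extract_answer(sample_once()) for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]
```

In this sketch, num_steps is the compute-for-performance knob the abstract refers to: more denoising passes let the model refine the whole rationale more times, at proportionally higher inference cost.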
