NeurIPS Stepwise Inference in Transformers: Exploring a Synthetic Graph Navigation Task

Poster in Workshop: Workshop on robustness of zero/few-shot learning in foundation models (R0-FoMo)

Mikail Khona · Maya Okawa · Rahul Ramesh · Kento Nishi · Robert Dick · Ekdeep S Lubana · Hidenori Tanaka

Abstract:

Taking correct steps through elementary logical operations is the essence of logical reasoning, culminating in precise planning outcomes. While such stepwise inference approaches have demonstrated benefits in Large Language Models (LLMs), conducting an accurate quantitative evaluation is challenging, given their extensive scale, complexity, and lack of accessibility. We introduce a minimal synthetic setup, where an autoregressive language model solves a navigation task on directed acyclic graphs (DAGs), taking inspiration from computational graphs and execution traces. By implementing training with sample paths from start to goal node in a 'step-by-step' manner, we perform systematic experiments and develop novel analyses illustrating that stepwise navigation proves advantageous when the underlying graph is hierarchical and generalization necessitates the stitching of subpaths observed during pretraining. Further, we observe a diversity-accuracy tradeoff while varying sampling temperature and a bias towards generating shorter paths. We next elucidate how in-context chain-of-thought exemplars can steer the model's navigation. Importantly, these exemplars can guide the model to follow a path of reasoning we provide, instead of relying on its potentially biased priors. Together, this work showcases the utility and adaptability of this paradigm in exploring the complexities of logical reasoning and planning in LLMs.
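To make the setup concrete, the following is a minimal sketch of how step-by-step training paths on a DAG might be generated. It is an illustration under stated assumptions, not the authors' data pipeline: the function names (make_dag, sample_path, format_example), the random-graph construction, and the "goal / start / path" token layout are all hypothetical choices made for this example.

```python
import random


def make_dag(num_nodes, edge_prob, rng):
    """Build a random DAG as an adjacency dict (assumed construction).

    Edges only go from a lower index to a higher index, so no cycle can form.
    """
    return {
        u: [v for v in range(u + 1, num_nodes) if rng.random() < edge_prob]
        for u in range(num_nodes)
    }


def reachable_from(dag, start):
    """Return the set of nodes reachable from start (including start)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in dag[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def sample_path(dag, start, goal, rng):
    """Sample one start-to-goal path, or return None if goal is unreachable.

    At each step the walk picks uniformly among children that can still
    reach the goal, so it never gets stuck in a dead end.
    """
    can_reach_goal = {u for u in dag if goal in reachable_from(dag, u)}
    if start not in can_reach_goal:
        return None
    path = [start]
    while path[-1] != goal:
        choices = [v for v in dag[path[-1]] if v in can_reach_goal]
        path.append(rng.choice(choices))
    return path


def format_example(path):
    """Render a path as a token string (hypothetical format)."""
    nodes = " ".join(f"X{n}" for n in path)
    return f"goal: X{path[-1]} start: X{path[0]} path: {nodes}"


if __name__ == "__main__":
    rng = random.Random(0)
    dag = make_dag(num_nodes=10, edge_prob=0.3, rng=rng)
    path = sample_path(dag, start=0, goal=9, rng=rng)
    print(format_example(path) if path else "goal not reachable from start")
```

An autoregressive model trained on such strings sees every intermediate node, which is the 'step-by-step' supervision the abstract refers to; a non-stepwise baseline could be built by omitting the intermediate nodes from the string.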
