NeurIPS Automating Thought of Search: A Journey Towards Soundness and Completeness

Poster in Workshop: Workshop on Open-World Agents: Synergizing Reasoning and Decision-Making in Open-World Environments (OWA-2024)

Automating Thought of Search: A Journey Towards Soundness and Completeness

Daniel Cao · Michael Katz · Harsha Kokel · Kavitha Srinivas · Shirin Sohrabi Araghi

Keywords: [ planning with language models ]

Abstract:

Planning remains one of the last standing bastions for large language models (LLMs), which now turn their attention to search. Most of the literature uses the language models as world models to define the search space, forgoing soundness for the sake of flexibility. A recent work, Thought of Search (ToS), proposed defining the search space with code, having the language models produce that code. ToS requires a human in the loop, collaboratively producing a sound successor function and goal test. The result, however, is worth the effort: all the tested datasets were solved with 100% accuracy. At the same time, LLMs have demonstrated significant progress in code generation and refinement for complex reasoning tasks. In this work, we automate ToS (AutoToS), completely taking the human out of the loop of solving planning problems. AutoToS guides the language model step by step towards the generation of sound and complete search components, through feedback from both generic and domain-specific unit tests. We achieve 100% accuracy, with minimal feedback iterations, using LLMs of various sizes on all evaluated domains.
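
To make "search components" concrete, here is a minimal Python sketch of a successor function, a goal test, and a unit-test-style check whose messages could be fed back to a model. It is an illustration under stated assumptions, not the authors' implementation: the toy domain (combine numbers arithmetically to reach 24), the function names `successors`, `is_goal`, `bfs`, and `check_successors`, and the specific checks are all hypothetical and are not specified in the abstract.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def successors(state):
    """Hypothetical successor function: replace two numbers by one arithmetic result."""
    result = set()
    for i, j in combinations(range(len(state)), 2):
        a, b = state[i], state[j]
        rest = [state[k] for k in range(len(state)) if k not in (i, j)]
        candidates = {a + b, a * b, a - b, b - a}
        if b != 0:
            candidates.add(a / b)
        if a != 0:
            candidates.add(b / a)
        for value in candidates:
            # States are sorted tuples so that equivalent states compare equal.
            result.add(tuple(sorted(rest + [value])))
    return result


def is_goal(state, target=24):
    """Hypothetical goal test: exactly one number remains and it equals the target."""
    return len(state) == 1 and state[0] == target


def bfs(initial):
    """Plain breadth-first search driven only by successors() and is_goal()."""
    start = tuple(sorted(Fraction(x) for x in initial))
    frontier = deque([start])
    seen = {start}
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


def check_successors(state):
    """Assumed unit-test-style checks; returns a list of feedback messages."""
    feedback = []
    snapshot = list(state)
    succ = successors(state)
    # Generic check: the successor function must not modify its input.
    if list(state) != snapshot:
        feedback.append("successor function mutated its input state")
    # Domain-specific check: each step must consume exactly one number.
    for s in succ:
        if len(s) != len(state) - 1:
            feedback.append(f"successor {s} has wrong length")
    return feedback
```

In this sketch the search itself is ordinary code, so correctness depends only on the two generated functions. An empty list from `check_successors` means no feedback is needed; a non-empty list is the kind of message that could be returned to the model for another refinement round.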
