NeurIPS NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving

Poster
in
Workshop: MATH-AI: The 4th Workshop on Mathematical Reasoning and AI

NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving

Laetitia Teodorescu · Guillaume Baudart · Emilio Arias · marc lelarge

Keywords: [ Reasoning ] [ search ] [ Large language models ] [ LLM agents ] [ mathematics ] [ interactive theorem proving ]

[ Abstract ] [ Project Page ]

[ OpenReview]

Abstract:

Formal theorem proving is challenging for humans as well as for machines. Thanks to recent advances in LLM capabilities, we believe natural language can serve as a universal interface for reasoning about formal proofs. In this paper, 1) we introduce Pétanque, a new lightweight environment to interact with the Coq theorem prover; 2) we present two interactive proof protocols leveraging natural language as an intermediate representation for designing proof steps; 3) we implement beam search over these interaction protocols, using natural language to rerank proof candidates; and 4) we use Pétanque to benchmark our search algorithms. Using our method with GPT-4o we can successfully synthesize proofs for 58% of the first 100/260 lemmas from the newly published Busy Beaver proofs.

Chat is not available.

Poster in Workshop: MATH-AI: The 4th Workshop on Mathematical Reasoning and AI

NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving

Laetitia Teodorescu · Guillaume Baudart · Emilio Arias · marc lelarge

Poster
in
Workshop: MATH-AI: The 4th Workshop on Mathematical Reasoning and AI