Poster
in
Workshop: MATH-AI: Toward Human-Level Mathematical Reasoning
ProofNet: A Benchmark for Autoformalizing and Formally Proving Undergraduate-Level Mathematics Problems
Zhangir Azerbayev · Bartosz Piotrowski · Jeremy Avigad
We introduce \textsf{ProofNet}, a benchmark for autoformalization and formal proving of undergraduate-level mathematics. The \textsf{ProofNet} benchmarks consists of 297 theorem statements expressed in both natural language and the Lean 3 theorem prover, 100 of which are also accompanied by natural language proofs. The problems are primarily drawn from popular undergraduate pure mathematics textbooks, and cover topics such as real and complex analysis, linear algebra, abstract algebra, and topology. We intend for \textsf{ProofNet} to be a challenging benchmark that will drive progress in autoformalization and automatic theorem proving. We report baseline results on the autoformalization of statements using few-shot learning with large language models.