Poster in Workshop: MATH-AI: The 4th Workshop on Mathematical Reasoning and AI
Synthesizing Verified Mathematical Problems
Xuefeng Li · Yanheng He · Pengfei Liu
Keywords: [ Large Language Model; Mathematical Reasoning; Data Synthesis ]
Mathematical data synthesis is a promising way to improve the mathematical capabilities of large language models. However, existing methods either synthesize many rationales for existing questions, which limits question diversity, or rely on advanced proprietary models to generate new questions directly without verification, which cannot guarantee that the synthesized problems are correct. This paper introduces mathematical data synthesis through Algorithmic Abstraction, Implementation, and Contextualization (AIC), a method for synthesizing new and verifiable mathematical problems. AIC abstracts mathematical problems into algorithms, implements these algorithms as code functions, and contextualizes them under different conditions to create new problems, which are then verified using the code functions. Experimental results on multiple challenging mathematical benchmarks show that models fine-tuned on the synthesized data outperform previous state-of-the-art models. Further experiments indicate that, with the synthesizer held fixed, data synthesized with AIC is both more accurate and more effective at improving the model's mathematical abilities.
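The abstraction, implementation, contextualization, and verification steps described in the abstract can be illustrated with a small Python sketch. This is not the authors' pipeline: the chosen algorithm (combined work rate), the templates, and the names `combined_work_time`, `Problem`, `contextualize`, and `verify` are all hypothetical, and the language-model steps of the real method are replaced here by hand-written templates.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Dict, List


# Abstraction + implementation (hypothetical example): the algorithm behind
# "two workers finish a job together" problems, written as a code function.
def combined_work_time(rate_a: Fraction, rate_b: Fraction, total: Fraction) -> Fraction:
    """Return the time two workers need to finish `total` units together."""
    combined_rate = rate_a + rate_b
    if combined_rate <= 0:
        raise ValueError("combined rate must be positive")
    return total / combined_rate


@dataclass
class Problem:
    question: str
    params: Dict[str, Fraction]
    answer: Fraction


# Contextualization: the same algorithm wrapped in different surface stories.
# In this sketch the templates are hand-written stand-ins (an assumption).
TEMPLATES: List[str] = [
    "Pipe A delivers {rate_a} liters per minute and pipe B delivers {rate_b} "
    "liters per minute. How many minutes do they need together to fill a "
    "{total}-liter tank?",
    "Ana types {rate_a} pages per hour and Ben types {rate_b} pages per hour. "
    "How many hours do they need together to type {total} pages?",
]


def contextualize(template: str, **params: Fraction) -> Problem:
    """Instantiate a template and compute its reference answer with the function."""
    answer = combined_work_time(**params)
    return Problem(template.format(**params), dict(params), answer)


def verify(problem: Problem, proposed_answer: Fraction) -> bool:
    """Check a proposed answer by re-running the code function on the parameters."""
    return combined_work_time(**problem.params) == proposed_answer


# Build one problem per template from the same parameters.
problems = [
    contextualize(t, rate_a=Fraction(3), rate_b=Fraction(2), total=Fraction(60))
    for t in TEMPLATES
]
```

Because every generated question keeps the parameters that produced it, a proposed answer can be checked mechanically with `verify` rather than trusted as generated. Exact `Fraction` arithmetic is used in this sketch so that the equality check is not affected by floating-point rounding.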