Poster
AutoPSV: Automated Process-Supervised Verifier
Jianqiao Lu · Zhiyang Dou · Hongru WANG · Zeyu Cao · Jianbo Dai · Yunlong Feng · Zhijiang Guo
East Exhibit Hall A-C #3011
In this work, we propose a novel method named Automated Process-Supervised Verifier (AutoPSV) to enhance the reasoning capabilities of large language models (LLMs) by automatically annotating their reasoning steps. AutoPSV begins by training a verification model on the correctness of final answers, which enables it to generate process annotations automatically. This verification model assigns a confidence score to each reasoning step, indicating the probability of arriving at the correct final answer from that point onward. We detect relative changes in the verification model's confidence scores across reasoning steps to annotate the reasoning process automatically, which enables error detection even when ground-truth answers are unavailable. This reduces the need for extensive manual annotation and avoids the high computational cost of model-induced annotation approaches. We experimentally validate that the step-level confidence changes learned by a verification model trained on final-answer correctness can effectively identify errors in the reasoning steps. We also demonstrate that a verification model trained on process annotations generated by AutoPSV performs better at selecting correct answers from multiple LLM-generated outputs. Notably, we achieve substantial improvements across five datasets in mathematics and commonsense reasoning. The source code of AutoPSV is available at https://github.com/rookie-joe/AutoPSV.
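The two ideas in the abstract, labeling steps by the relative change in verifier confidence and selecting one answer from several candidates, can be illustrated with a minimal Python sketch. This is not the released AutoPSV implementation. The function names, the relative-change formula (curr - prev) / prev, the threshold of -0.5, and the choice to treat the first step as unflagged are all assumptions made for illustration; the confidence scores are assumed to come from a verifier that is not shown here.

```python
from typing import List, Sequence


def annotate_steps(confidences: Sequence[float], threshold: float = -0.5) -> List[bool]:
    """Label each reasoning step True (no error suspected) or False (suspected error).

    `confidences[i]` is a hypothetical verifier score for step i: the estimated
    probability of reaching the correct final answer from that step onward.
    The threshold value is an assumption, not a value taken from the abstract.
    """
    if not confidences:
        return []
    labels = [True]  # assumption: the first step has no predecessor, so it is not flagged
    for prev, curr in zip(confidences, confidences[1:]):
        # Relative change in confidence from the previous step to this one.
        rel_change = (curr - prev) / prev if prev > 0 else 0.0
        # A sharp relative drop is treated as a sign that this step introduced an error.
        labels.append(rel_change > threshold)
    return labels


def select_best(candidates: Sequence[str], scores: Sequence[float]) -> str:
    """Return the candidate answer with the highest verifier score."""
    if not candidates or len(candidates) != len(scores):
        raise ValueError("candidates and scores must be non-empty and of equal length")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return candidates[best_index]


if __name__ == "__main__":
    # Confidence drops sharply at the third step, so only that step is flagged.
    print(annotate_steps([0.8, 0.82, 0.3, 0.28]))  # [True, True, False, True]
    print(select_best(["12", "15", "18"], [0.1, 0.9, 0.4]))  # 15
```

Note that no ground-truth answer appears anywhere in `annotate_steps`: the labels depend only on the verifier's own scores, which is what allows annotation when reference answers are unavailable.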