Poster in Workshop: MATH-AI: The 4th Workshop on Mathematical Reasoning and AI
Intermediate Fine-Tuning Improves Mathematical Reasoning in Smaller Models
Neeraj Gangwar · Suma Bhat · Nickvash Kani
Keywords: [ Math Reasoning ] [ Reasoning ] [ intermediate fine-tuning ]
While large models pre-trained on high-quality data perform well across various reasoning tasks, including mathematical reasoning (e.g., GSM8k, MultiArith), specializing smaller models in mathematical reasoning remains challenging. A common approach is to distill knowledge from large pre-trained teacher models into smaller student models. Other techniques augment datasets by rephrasing questions or by using multiple views of solutions to improve reasoning performance. In this work, we explore intermediate fine-tuning and show that fine-tuning a model on an arithmetic dataset before fine-tuning it on a reasoning dataset improves the model's performance on reasoning tasks. The arithmetic dataset can be generated programmatically, eliminating the resource-intensive task of dataset creation. We evaluate the impact of intermediate fine-tuning using the original GSM8k training set and an expanded GSM8k training set created through distillation. Our experiments on multiple datasets show that, with greedy decoding, intermediate fine-tuning yields average improvements of 6.3% and 14.2% on reasoning tasks for the original and distilled training sets, respectively, relative to models fine-tuned directly on those sets.
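As a rough illustration of what "generated programmatically" can mean for the arithmetic dataset, the sketch below builds input/target pairs from random operands. The abstract does not specify the operations, operand ranges, or text format used, so everything here (the `OPS` table, `make_arithmetic_example`, `make_arithmetic_dataset`, and the `input`/`target` field names) is a hypothetical choice, not the authors' generator.

```python
import operator
import random

# Hypothetical operator set; the abstract does not say which operations are used.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def make_arithmetic_example(rng, max_operand=999):
    """Build one arithmetic problem as an input/target text pair (assumed format)."""
    a = rng.randint(0, max_operand)
    b = rng.randint(0, max_operand)
    symbol = rng.choice(sorted(OPS))  # sorted for a stable, reproducible order
    return {"input": f"{a} {symbol} {b}", "target": str(OPS[symbol](a, b))}


def make_arithmetic_dataset(n, seed=0):
    """Generate n examples; a fixed seed makes the dataset reproducible."""
    rng = random.Random(seed)
    return [make_arithmetic_example(rng) for _ in range(n)]


if __name__ == "__main__":
    for example in make_arithmetic_dataset(3):
        print(example)
```

Under this reading, a model would first be fine-tuned on pairs like these and then fine-tuned on the reasoning training set (original or distilled GSM8k), in that order.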