

Poster in Workshop: Attributing Model Behavior at Scale (ATTRIB)

Weak-to-Strong In-Context Optimization of Language Model Reasoning

Keshav Ramji · Alok Shah · Vedant Gaur · Khush Gupta


Abstract:

Large language models (LLMs) have demonstrated remarkable in-context learning capabilities, leveraging demonstrations to perform a task adeptly. Recent work has shown that such models can perform optimization over a response-scoring function, evaluating the quality of suboptimal generations and using them as exemplars to produce a better response. In this work, we explore this phenomenon further and ask whether strong LLMs can optimize their reasoning paths by leveraging differentiated copies of a weak model. Central to our approach is the use of filler tokens interleaved after each step in the reasoning chain. We define reasoning optimality, our implicit objective function, in terms of "efficiency," measured by the number of reasoning steps. At inference time, three copies of the weak model, each fine-tuned on synthetic data of a different efficiency level, generate responses that serve as exemplars for in-context optimization by the strong model. We evaluate this method on the MMLU benchmark with Gemma-2 2B-it weak learners and Llama-3.1-405B-Instruct as the strong model, and demonstrate that our approach improves performance at low cost.
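The inference-time loop the abstract describes can be summarized in a short sketch. The snippet below is a minimal illustration only, assuming hypothetical `weak_models` and `strong_model` callables that stand in for the fine-tuned Gemma-2 2B-it copies and Llama-3.1-405B-Instruct; the step delimiter, prompt template, and helper names are assumptions for illustration, not the authors' released implementation.

```python
# Hedged sketch of weak-to-strong in-context optimization as described in the
# abstract. The weak models are assumed to already emit step-delimited chains
# (with any filler tokens handled by their fine-tuning); none of the names or
# templates here are taken from the paper.

from typing import Callable, List

def count_steps(response: str, delimiter: str = "\n") -> int:
    """Efficiency proxy: number of non-empty reasoning steps in a chain."""
    return len([s for s in response.split(delimiter) if s.strip()])

def weak_to_strong_optimize(
    question: str,
    weak_models: List[Callable[[str], str]],   # three fine-tuned weak copies
    strong_model: Callable[[str], str],        # the strong model
) -> str:
    # 1. Each differentiated weak copy produces a candidate reasoning chain.
    candidates = [m(question) for m in weak_models]

    # 2. Score candidates by the implicit objective: fewer steps means more
    #    "efficient" reasoning. Sort worst-to-best so the most efficient
    #    exemplar appears last in context.
    candidates.sort(key=count_steps, reverse=True)

    # 3. Present the scored trajectory as in-context exemplars and ask the
    #    strong model to continue the improvement.
    exemplars = "\n\n".join(
        f"Attempt ({count_steps(c)} steps):\n{c}" for c in candidates
    )
    prompt = (
        f"Question: {question}\n\n"
        "Previous attempts, ordered from least to most efficient:\n"
        f"{exemplars}\n\n"
        "Produce a more efficient step-by-step solution."
    )
    return strong_model(prompt)
```

Ordering the exemplars from least to most efficient mirrors the in-context optimization framing: the strong model sees a trajectory of improving responses under the scoring function and is prompted to extrapolate it.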
