Poster in Workshop on Robustness of Zero/Few-Shot Learning in Foundation Models (R0-FoMo)
Function Constrained Program Synthesis
Patrick A. Hajali · Ignas Budvytis
This work introduces: (1) a technique that allows pre-trained large language models (LLMs) to leverage user-provided code when solving programming tasks, and (2) a method to iteratively generate modular sub-functions that aid future code-generation attempts when the code initially produced by the LLM is inadequate. Generating programs in general-purpose languages like Python poses a challenge for LLMs when they are restricted to using only the code provided in the prompt. A naive approach is to present a chat-based LLM (e.g. GPT-4, Claude) with relevant code snippets and prompt it to synthesize the target algorithm from the provided code. Alternatively, code-specific LLMs (e.g. GitHub Copilot, CodeLlama2) can generate code completions in real time by drawing on all code available in the integrated development environment. However, restricting code-specific LLMs to in-context code is not straightforward: the model is not explicitly instructed to use the user-provided code, and users cannot specify precisely which snippets the model should incorporate into its context for subsequent generations. Moreover, both chat and code LLMs lack effective recovery mechanisms, forcing users to repeatedly re-prompt the model with modified prompts until a satisfactory solution is reached.

Our method differs from conventional LLM-powered code generation by constraining generation to an explicit function set and by enabling recovery from failed attempts through automatically generated sub-functions. When the LLM cannot produce working code, we generate modular sub-functions that aid subsequent attempts at producing a functional solution. A by-product of our method is a library of reusable sub-functions that can solve related tasks, imitating a software team whose efficiency scales with experience. We also introduce a new "half-shot" evaluation paradigm that provides tighter estimates of LLMs' coding abilities than traditional zero-shot evaluation. Finally, our method encourages models to output solutions in a structured format, reducing syntax errors that might otherwise be mistaken for poor coding ability.
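To make the idea concrete, the overall loop can be sketched as below. This is a minimal illustration under our own assumptions, not the authors' released implementation: the names build_prompt, synthesize, and the generate callable are hypothetical stand-ins for prompt construction and a chat-LLM call, and the test harness is left abstract.

from typing import Callable, Dict, Optional


def build_prompt(task: str, library: Dict[str, str]) -> str:
    """Present the task plus the allowed function set; instruct the model
    to use ONLY these functions and to reply in a structured format."""
    functions = "\n\n".join(library.values())
    return (
        "Solve the task below using ONLY the functions provided.\n"
        "Return a single Python function named `solution`.\n\n"
        f"# Available functions\n{functions}\n\n# Task\n{task}\n"
    )


def synthesize(
    task: str,
    tests: Callable[[str], bool],          # executes generated code against unit tests
    library: Dict[str, str],               # function name -> source of allowed functions
    generate: Callable[[str], str],        # any chat-LLM completion call
    max_attempts: int = 3,
) -> Optional[str]:
    """Function-constrained generation with recovery via new sub-functions."""
    for attempt in range(max_attempts):
        code = generate(build_prompt(task, library))
        if tests(code):
            return code  # working solution found
        # Recovery step: ask the model for a small, reusable helper and add it
        # to the library, so the next attempt (and related future tasks) can
        # build on it instead of starting from scratch.
        helper = generate(
            "The previous attempt failed its tests. Write ONE short, "
            "self-contained Python helper function that would simplify "
            f"this task:\n{task}"
        )
        library[f"helper_{attempt}"] = helper
    return None

In this sketch, the growing library dictionary plays the role of the reusable sub-function library described above: helpers accumulated while solving one task remain available when synthesize is called on related tasks.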