Poster in Workshop: Causality and Large Models

CodeSCM: Causal Analysis for Multi-Modal Code Generation

Mukur Gupta · Noopur Bhatt · Suman Jana

Keywords: [ Causal Analysis ] [ Multi-modal code generation ]


Abstract:

In this paper, we propose CodeSCM, a Structural Causal Model (SCM) for analyzing multi-modal code generation using large language models (LLMs). By applying interventions to CodeSCM, we define the causal effects of different prompt modalities, such as natural language, code, and input-output examples, on the model's output. CodeSCM introduces latent mediator variables to separate the code semantics from the natural language semantics of a multi-modal prompt. Using the principles of Causal Mediation Analysis on these mediators, we define direct effects through targeted interventions, quantitatively representing the model's spurious leanings. We find that, in addition to natural language instructions, input-output examples significantly influence model generation. Total causal effect evaluations from CodeSCM also reveal the memorization of code generation benchmarks.
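
To illustrate the distinction between total and direct effects that the abstract relies on, the sketch below uses a toy linear SCM with a single mediator. It is not the CodeSCM graph: the variables `X`, `M`, `Y`, the functions `mediator` and `outcome`, and all coefficients are hypothetical and chosen only so the arithmetic is easy to follow. The effect definitions are the standard ones from causal mediation analysis.

```python
# Toy illustration only: a hypothetical linear SCM, NOT the CodeSCM graph.
# X: the prompt modality under intervention (0.0 = original, 1.0 = perturbed)
# M: a latent mediator, standing in for the semantics carried by that modality
# Y: a scalar outcome, standing in for a generation-quality score


def mediator(x: float) -> float:
    """Structural equation for M (assumed coefficient)."""
    return 2.0 * x


def outcome(x: float, m: float) -> float:
    """Structural equation for Y (assumed coefficients)."""
    return 0.5 * x + 1.5 * m


def total_effect(x0: float, x1: float) -> float:
    """Change in Y under do(X=x1) vs do(X=x0), mediator free to respond."""
    return outcome(x1, mediator(x1)) - outcome(x0, mediator(x0))


def natural_direct_effect(x0: float, x1: float) -> float:
    """Change in Y when X moves x0 -> x1 but M is held at its x0 value."""
    m0 = mediator(x0)
    return outcome(x1, m0) - outcome(x0, m0)


def natural_indirect_effect(x0: float, x1: float) -> float:
    """Change in Y when X stays at x0 but M takes its x1 value."""
    return outcome(x0, mediator(x1)) - outcome(x0, mediator(x0))


te = total_effect(0.0, 1.0)
nde = natural_direct_effect(0.0, 1.0)
nie = natural_indirect_effect(0.0, 1.0)
print(te, nde, nie)
```

In this toy model the direct effect is the part of the change in `Y` that bypasses the mediator. In a linear model without interaction terms, such as this one, the total effect decomposes exactly into the direct and indirect parts; that decomposition need not hold in general.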
