Poster in Workshop: Compositional Learning: Perspectives, Methods, and Paths Forward
Exploring A Bayesian View On Compositional and Counterfactual Generalization
Patrik Reizinger · Rahul Krishnan
Keywords: [ compositionality ] [ causality ] [ compositional generalization ] [ counterfactuals ] [ ood ]
Large models trained on vast data sets can achieve both minimal training and test loss and thus generalize statistically. However, their most interesting properties, such as good transfer performance and extrapolation, concern out-of-distribution (OOD) data. One desired OOD property is compositional generalization, where models generalize to unseen feature combinations. While compositional generalization promises good performance across a wide range of OOD scenarios, it does not account for the plausibility of such combinations; a ubiquitous example is hallucinations in large models. Building on recent advances in Bayesian causal inference, we propose a unified perspective on counterfactual and compositional generalization. We use a causal world model to reason about the plausibility of unseen combinations. By introducing a Bayesian prior, we show that counterfactual generalization is a special case of compositionality, restricted to realistic combinations. This perspective allows us to formally characterize hallucinations, and it opens up new research directions to equip generative AI models with a formally motivated "switch" between realistic and non-realistic/creative modes.
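To make the idea concrete, here is a minimal, purely illustrative sketch (not the authors' method): a toy world with two composable attributes, a hypothetical causal prior over their combinations, and a "switch" that either restricts sampling to high-prior (realistic) combinations or samples the full compositional product space (creative, possibly hallucinated). All names, the prior table, and the `sample_combination` helper are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy compositional space: each sample combines one color with one shape.
colors = ["red", "green", "blue"]
shapes = ["cube", "sphere", "banana"]

# Hypothetical causal prior over combinations (illustrative values, not
# from the paper): plausible combinations get mass, implausible ones
# get (near-)zero mass, e.g. "blue banana".
prior = np.array([
    [0.20, 0.20, 0.05],   # red:   cube, sphere, banana
    [0.20, 0.20, 0.00],   # green: cube, sphere, banana
    [0.10, 0.05, 0.00],   # blue:  "blue banana" is implausible
])
prior /= prior.sum()

def sample_combination(realistic: bool, temperature: float = 1.0):
    """Sample a (color, shape) combination.

    realistic=True  -> counterfactual mode: sample under the causal prior,
                       so zero-prior combinations never appear.
    realistic=False -> creative mode: ignore plausibility and sample
                       uniformly over all compositions (may hallucinate).
    """
    if realistic:
        # Tempered prior: lower temperature concentrates on plausible modes.
        probs = prior.flatten() ** (1.0 / temperature)
        probs /= probs.sum()
    else:
        probs = np.full(prior.size, 1.0 / prior.size)
    idx = rng.choice(prior.size, p=probs)
    i, j = divmod(idx, prior.shape[1])
    return colors[i], shapes[j]

print("counterfactual:", [sample_combination(True) for _ in range(3)])
print("creative:      ", [sample_combination(False) for _ in range(3)])
```

In this toy picture, compositional generalization corresponds to the full product space of attributes, counterfactual generalization to the prior-restricted subset, and a hallucination to any sample whose prior mass is (near) zero.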