Poster in Workshop: Statistical Frontiers in LLMs and Foundation Models
Bayesian Concept Bottleneck Models with LLM Priors
Jean Feng · Avni Kothari · Lucas Zier · Chandan Singh · Yan Shuo Tan
Keywords: [ Concept bottleneck models ] [ Bayesian inference ] [ Interpretability ] [ Large language models ]
Concept Bottleneck Models (CBMs) are "white-box" models that map inputs onto a set of human-interpretable concepts, which are then used to make predictions. However, current approaches for learning CBMs (i) typically require a predefined set of concepts even though the relevant concepts are often unknown a priori, (ii) require dense concept annotations for each observation, and (iii) do not provide rigorous uncertainty quantification to indicate which concepts are truly relevant. This work addresses these three challenges by introducing BC-LLM, whose central idea is to view large language models (LLMs) as a mechanism for generating imperfect priors and annotations for a potentially infinite set of human-interpretable concepts, and to use Bayesian machinery to obtain statistically rigorous inference. BC-LLM is a general framework that is applicable to multiple data modalities, including tabular data, images, and text. Experimental results demonstrate that BC-LLM learns CBMs that achieve better performance than existing methods, converges more quickly to the true concepts, "learns away" spurious correlations as data accumulate, and can suggest features for ML developers to engineer.
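To make the central idea concrete, below is a minimal sketch of Bayesian inference over concept subsets where an LLM supplies the prior and the annotations. This is an illustration, not the paper's actual algorithm: it assumes a small fixed candidate pool (whereas BC-LLM handles a potentially infinite concept space with the LLM as a proposal mechanism), and the LLM calls are mocked with synthetic stand-ins. All names here (`llm_annotate`, `LLM_PRIOR`, `CANDIDATE_CONCEPTS`) are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# --- Hypothetical stand-ins for the LLM calls (names are illustrative) ---
# In the real framework these would query an LLM; here they are hard-coded/mocked.
CANDIDATE_CONCEPTS = ["mentions fever", "mentions cough",
                      "mentions travel", "mentions rash"]
LLM_PRIOR = np.array([0.6, 0.5, 0.2, 0.1])  # LLM's prior that each concept is relevant

def llm_annotate(n_obs, k):
    """Mocked LLM annotation: a binary label per (observation, concept) pair."""
    return rng.integers(0, 2, size=(n_obs, k))

# Synthetic data: the outcome truly depends on the first two concepts only.
n, K = 500, len(CANDIDATE_CONCEPTS)
X = llm_annotate(n, K)
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = rng.binomial(1, 1 / (1 + np.exp(-logits)))

def log_posterior(subset):
    """Log-likelihood of a logistic model on the chosen concepts, plus the LLM prior."""
    idx = np.flatnonzero(subset)
    if idx.size == 0:
        p = np.full(n, y.mean())  # intercept-only model
    else:
        clf = LogisticRegression(max_iter=1000).fit(X[:, idx], y)
        p = clf.predict_proba(X[:, idx])[:, 1]
    ll = np.sum(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    log_prior = np.sum(np.where(subset, np.log(LLM_PRIOR), np.log(1 - LLM_PRIOR)))
    return ll + log_prior

# --- Metropolis-Hastings over concept subsets ---
subset = rng.integers(0, 2, size=K).astype(bool)
cur = log_posterior(subset)
n_steps = 400
inclusion_counts = np.zeros(K)
for _ in range(n_steps):
    proposal = subset.copy()
    j = rng.integers(K)
    proposal[j] = ~proposal[j]            # flip one concept in or out
    new = log_posterior(proposal)
    if np.log(rng.random()) < new - cur:  # symmetric proposal -> simple MH ratio
        subset, cur = proposal, new
    inclusion_counts += subset

# Posterior inclusion frequencies give the "rigorous uncertainty" over concepts.
for name, freq in zip(CANDIDATE_CONCEPTS, inclusion_counts / n_steps):
    print(f"P({name!r} in model) ~ {freq:.2f}")
```

Run as-is, the sampler concentrates posterior mass on the two truly predictive concepts and downweights the rest, which is the behavior the abstract describes: uncertainty quantification over which concepts are relevant, with spurious ones "learned away" as data accumulate.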