Abstract:
Large language models (LLMs) are empowering decision-making in open-world agents across several applications, including tool or API usage and answering multiple-choice questions (MCQs). However, they often make overconfident, incorrect predictions, or ``hallucinations'', which can be risky in high-stakes settings like healthcare and finance. To improve safety, we leverage conformal prediction (CP), a model-agnostic framework that provides distribution-free uncertainty quantification. CP transforms a score function, which measures how well an output ``conforms'' to a given input, into prediction sets that contain the true answer with high probability. While CP provides this coverage guarantee for arbitrary scores, the quality of the scores significantly impacts the sizes of the prediction sets. Prior works have relied on LLM logits or other heuristic scores, with no guarantees on their quality. To address this issue, we propose an optimization framework (CP-OPT) to learn scores that minimize set sizes while maintaining the coverage guarantee. Furthermore, leveraging the coverage guarantee of CP, we propose conformal revision of questions (CROQs), which narrows the problem by restricting the available choices to those in the prediction set. Our results on the MMLU and ToolAlpaca datasets with the Llama3 and Phi-3 models show that optimized CP scores reduce set sizes by up to $13\%$, and CROQs improves accuracy by up to $4.6\%$ (relative) overall and by up to $15\%$ in non-trivial parts of the input space.
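As a minimal sketch of the mechanism the abstract describes, the example below shows standard split conformal prediction with conformity scores (higher means the option conforms better), followed by a question revision that keeps only the options in the prediction set. It is not the paper's CP-OPT or CROQ implementation: the function names (`conformal_threshold`, `prediction_set`, `croq_prompt`), the option texts, the score values, and the prompt wording are all hypothetical placeholders.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    # cal_scores[i] = conformity score of the *true* answer for the i-th calibration
    # question (higher = conforms better). Returns a threshold tau such that prediction
    # sets built with tau contain the true answer with probability >= 1 - alpha.
    n = len(cal_scores)
    k = int(np.floor((n + 1) * alpha))  # rank of tau among the sorted calibration scores
    if k < 1:
        return -np.inf                  # calibration set too small: keep every option
    return np.sort(cal_scores)[k - 1]   # k-th smallest conformity score

def prediction_set(option_scores, tau):
    # Keep every MCQ option whose conformity score clears the threshold.
    return [label for label, s in option_scores.items() if s >= tau]

def croq_prompt(question, options, kept_labels):
    # Revise the question so the model chooses only among the retained options.
    listed = "\n".join(f"({lbl}) {options[lbl]}" for lbl in kept_labels)
    return f"{question}\nChoose one of the following options:\n{listed}"

# Hypothetical usage with made-up scores (e.g., softmax of LLM logits or a learned score):
options = {"A": "Paris", "B": "Lyon", "C": "Marseille", "D": "Nice"}
scores = {"A": 0.71, "B": 0.05, "C": 0.33, "D": 0.02}
tau = conformal_threshold(np.array([0.4, 0.6, 0.2, 0.8, 0.5] * 40), alpha=0.1)
kept = prediction_set(scores, tau)
print(croq_prompt("What is the capital of France?", options, kept))
```

The revised prompt lists only the retained options, so the model's final choice is restricted to a set that, by the conformal guarantee, contains the correct answer with probability at least $1-\alpha$.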