

Poster in Workshop: Statistical Frontiers in LLMs and Foundation Models

Reexpress: Similarity-Distance-Magnitude Calibration

Allen Schmaltz

Keywords: [ agent-based LLM systems ] [ LLM interpretability ] [ on-device AI ] [ LLM visualization ] [ compound LLM systems ] [ retrieval-based LLM systems ] [ LLMs ]

Sat 14 Dec 3:45 p.m. PST — 4:30 p.m. PST

Abstract:

In this system demonstration paper, we introduce Reexpress, a no-code, visual data analysis platform that leverages neural networks to obtain reliable and introspectable predictions over high-dimensional inputs. As a core part of Reexpress, we propose Similarity-Distance-Magnitude Calibration, a novel decoupling of aleatoric (irreducible) uncertainty and epistemic (reducible) uncertainty. Epistemic uncertainty is decomposed into the Similarity to training, the Distance to training, and a CDF-threshold-based transform of the class-conditional output Magnitude. These are summarized, along with the sample size, into an estimate of Calibration Reliability that accompanies every calibrated probability. The probabilities are frequency-based estimates of observable data partitions. This approach yields estimates with a degree of interpretability and actionability missing from extant approaches. Importantly, unlike other approaches for deriving uncertainty over large language models, Reexpress produces calibrated probabilities robust to the types of covariate and label shifts that can be unexpectedly encountered when modeling high-dimensional data. Dense matching capabilities at the document and feature levels, together with automatically generated interactive visualizations, enable auditing of the data (including label quality), the predictions, and the predictive uncertainty. We demonstrate the on-device version of Reexpress in two representative settings: fact verification over the latent knowledge of Mixtral 8x7B in a distribution-shifted setting, and deriving uncertainty over a series of complex legal reasoning tasks using GPT-4. Finally, we provide a preview of the cloud version by building the first uncertainty-aware generative AI-assisted search client, demonstrating how Reexpress is the substrate for building reliable multi-stage compound/agent-based systems that can conditionally branch for additional test-time compute based on the predictive uncertainty.
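
To make two ideas from the abstract concrete (probabilities as frequency-based estimates over data partitions, and branching on predictive uncertainty), here is a minimal Python sketch. It is not the Reexpress implementation: the partition keys, the function names `fit_partition_frequencies` and `route`, and the thresholds `min_prob` and `min_n` are all hypothetical, and a partition here is just a hand-assigned label rather than anything derived from Similarity, Distance, or Magnitude.

```python
from collections import defaultdict


def fit_partition_frequencies(calibration_rows):
    """Estimate per-partition accuracy from held-out calibration data.

    calibration_rows: iterable of (partition_key, is_correct) pairs.
    Returns {partition_key: (empirical_accuracy, sample_size)}.
    The partition keys are hypothetical stand-ins for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [num_correct, num_total]
    for key, is_correct in calibration_rows:
        counts[key][0] += int(is_correct)
        counts[key][1] += 1
    return {key: (correct / total, total) for key, (correct, total) in counts.items()}


def route(partition_key, table, min_prob=0.9, min_n=3):
    """Accept a prediction only if its partition is accurate and well populated.

    min_prob and min_n are assumed thresholds, not values from the paper.
    Unseen partitions fall through to "escalate" with a sample size of 0.
    """
    prob, n = table.get(partition_key, (0.0, 0))
    if n >= min_n and prob >= min_prob:
        return "accept", prob, n
    return "escalate", prob, n  # e.g. spend additional test-time compute


# Toy calibration set: four correct predictions in one partition,
# one correct and two incorrect in another.
rows = [
    (("high_sim", "near"), True),
    (("high_sim", "near"), True),
    (("high_sim", "near"), True),
    (("high_sim", "near"), True),
    (("low_sim", "far"), True),
    (("low_sim", "far"), False),
    (("low_sim", "far"), False),
]
table = fit_partition_frequencies(rows)
decision_high = route(("high_sim", "near"), table)
decision_low = route(("low_sim", "far"), table)
decision_unseen = route(("never", "seen"), table)
```

In this toy setup the well-populated, accurate partition is accepted, while the low-accuracy partition and the unseen partition are both escalated. Reporting the sample size alongside the frequency mirrors the abstract's point that each probability should carry an indication of how much data supports it.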
