Poster in Workshop: Statistical Frontiers in LLMs and Foundation Models

A Statistical Approach to Quantifying LLM Human Alignment

Harbin Hong · Liu Leqi · Sebastian Caldas

Keywords: [ Hypothesis Testing ] [ Statistical Benchmarks for LLM ] [ LLM-Human Opinion Alignment ]

Sat 14 Dec, 12:00 p.m. to 12:45 p.m. PST

Abstract:

The use of Large Language Models (LLMs) for tasks traditionally performed by humans is gaining attention across many fields of study, such as economics and marketing. As researchers continue to explore how to use LLMs in human subject studies, the divergence in analytical frameworks and their conclusions presents a unique problem. In this work, we present a general quantitative framework that uses hypothesis testing to assess misalignment between language models and human sub-populations, producing a principled and interpretable conclusion about whether a model is representative enough to simulate human opinion. Applying this framework to OpenAI's GPT-3.5-Turbo model and several Pew Research Center surveys, we find that GPT-3.5-Turbo is largely misaligned with all tested sub-populations (e.g., by race, age, or income), indicating that it cannot accurately model human opinion on contentious questions.
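The abstract does not specify the exact test statistic used, so the sketch below is only an illustration of the general idea: treat the human sub-population's answer distribution on a multiple-choice survey question as the null hypothesis and test whether repeated LLM samples for that sub-population could plausibly have been drawn from it. The chi-squared goodness-of-fit test, the persona-prompt sampling setup, and all counts are assumptions for illustration, not the authors' method or data.

```python
# Minimal sketch (hypothetical data, not the paper's implementation):
# chi-squared goodness-of-fit test of an LLM's answer distribution on one
# multiple-choice survey question against a human sub-population's distribution.
import numpy as np
from scipy.stats import chisquare

# Hypothetical human survey counts for one question
# (answer options: agree / neutral / disagree) within one sub-population.
human_counts = np.array([480, 120, 400])
human_props = human_counts / human_counts.sum()

# Hypothetical counts from repeatedly sampling the LLM with a persona prompt
# conditioned on the same sub-population and question.
llm_counts = np.array([310, 40, 150])

# Null hypothesis: the LLM's answers are drawn from the human distribution.
expected = human_props * llm_counts.sum()
stat, p_value = chisquare(f_obs=llm_counts, f_exp=expected)

alpha = 0.05
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null: the model's answers diverge from this sub-population.")
else:
    print("Fail to reject: no evidence of misalignment at this significance level.")
```

In practice, a framework of this kind would repeat such a test across many questions and sub-populations and correct for multiple comparisons before drawing an overall conclusion about alignment.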
