Poster
in
Workshop: Workshop on Behavioral Machine Learning

Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding

Yun-Shiuan Chuang · Nikunj Harlalka · Sameer Narendran · Alexander Cheung · Sizhe Gao · Siddharth Suresh · Junjie Hu · Timothy T Rogers


Abstract:

Guesstimation, the task of making approximate quantity estimates, is a common real-world challenge. However, it has been largely overlooked in large language model (LLM) research. We introduce a novel guesstimation dataset, MARBLES. This dataset requires one to estimate how many items (e.g., marbles) can fit into containers (e.g., a one-cup measuring cup), both with and without accompanying images. Inspired by the social science concept of the "Wisdom of Crowds" (WOC), i.e., taking the median of estimates from a crowd, which has proven effective in guesstimation, we propose a "WOC decoding" strategy for LLM guesstimation. We show that LLMs perform well on guesstimation, suggesting that they possess some level of the "world model" necessary for this task. Moreover, as with human performance, WOC decoding improves LLM guesstimation accuracy. Furthermore, the inclusion of images in the multimodal condition enhances model performance. These results highlight the value of the WOC decoding strategy for LLMs and position guesstimation as a probe for evaluating LLMs' world models.
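The aggregation idea behind WOC decoding can be illustrated with a minimal sketch: sample several independent estimates (e.g., from an LLM decoded at nonzero temperature) and take their median. The `noisy_guess` sampler, the marble count of 350, and all function names below are illustrative stand-ins, not the paper's actual implementation.

```python
import random
import statistics

def woc_decode(sample_estimate, n_samples=10):
    """Wisdom-of-Crowds decoding: aggregate several independently
    sampled estimates by taking their median."""
    estimates = [sample_estimate() for _ in range(n_samples)]
    return statistics.median(estimates)

# Stand-in for an LLM sampled at nonzero temperature: noisy,
# multiplicatively perturbed guesses around a hypothetical true
# count of 350 marbles (illustrative numbers only).
random.seed(0)
def noisy_guess():
    return 350 * random.lognormvariate(0, 0.3)

single_estimate = noisy_guess()
crowd_estimate = woc_decode(noisy_guess, n_samples=25)
print(round(single_estimate), round(crowd_estimate))
```

The median is used rather than the mean because guesstimates are often skewed with occasional extreme outliers, and the median is robust to them.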
