NeurIPS Does GPT Really Get It? A Hierarchical Scale to Quantify Human and AI's Understanding of Algorithms

Poster in Workshop: Workshop on Behavioral Machine Learning

Mirabel Reid · Santosh Vempala

Abstract:

As Large Language Models (LLMs) are used for increasingly complex cognitive tasks, a natural question is whether AI really understands. The study of understanding in LLMs is in its infancy, and the community has yet to incorporate research and insights from philosophy, psychology, and education. Here we focus on understanding algorithms, and propose a hierarchy of levels of understanding. We validate the hierarchy using a study with human subjects (undergraduate and graduate students). Following this, we apply the hierarchy to large language models (generations of GPT), revealing interesting similarities and differences with humans. We expect that our rigorous criteria for algorithm understanding will help monitor and quantify AI's progress in such cognitive domains.
