NeurIPS Path Divergence Objective: Boundedly-Rational Decision Making in Partially Observable Environments

Poster in Workshop: NeuroAI: Fusing Neuroscience and AI for Intelligent Solutions

Tomáš Gavenčiak · David Hyland · Lancelot Da Costa · Michael Wooldridge · Jan Kulveit

Abstract:

We introduce the Path Divergence Objective (PDO), a novel model of boundedly-rational decision-making in stochastic, partially-observable environments. The PDO is derived from fundamental physical principles, including embodiment and the inherent costs of information processing. This framework enables us to model key features observed in real-world agent behavior, such as curiosity-driven exploration, novelty-seeking, and the intention-behavior gap. By adjusting a single parameter, the PDO can describe a continuous spectrum of decision-making strategies, ranging from highly irrational to perfectly rational. This flexibility makes the PDO applicable to a wide range of scenarios, including modeling biological organisms, simulating interactions between agents with varying degrees of bounded rationality, addressing AI alignment challenges, and designing AI systems that interact more effectively with humans.
