Poster in Workshop: Learning and Decision-Making with Strategic Feedback (StratML)
Bounded Rationality for Multi-Agent Motion Planning and Behavior Learning
Junhong Xu · Kai Yin · Lantao Liu
When a robot shares a workspace with other intelligent agents (e.g., other robots or humans), it must reason about the behaviors of its neighboring agents while accomplishing its designated task. We assume that other agents do not explicitly share their decisions or reasoning processes with the robot, so the robot must itself reason about the consequences of all other agents' actions, a process that demands prohibitive computational resources. We observe that, in practice, predicting the exactly optimal agent behaviors is often unnecessary; it is more desirable to compute a close-to-optimal (sub-optimal) solution within an affordable (tractable) computation time. We propose to incorporate the concept of bounded rationality, from an information-theoretic viewpoint, into the general-sum stochastic game setting. Specifically, the robot computes its policy under an information-processing constraint, represented as the KL-divergence between the default and the optimized stochastic policies. The bounded-optimal policy can be obtained by an importance sampling approach. We present pilot results showing that this framework allows the robot to (1) effectively trade off solution optimality against limited computation time, and (2) efficiently learn the sub-optimal behaviors of other agents.
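To make the construction concrete, below is a minimal sketch of how such a bounded-optimal policy could be computed, assuming the standard information-theoretic formulation of bounded rationality: the agent maximizes expected value minus a KL penalty to a default policy pi0, which yields the closed-form solution pi*(a) proportional to pi0(a) * exp(beta * Q(a)), whose normalizer can be estimated by importance sampling from pi0. The names here (q_values, sample_default, beta, n) are illustrative assumptions, not identifiers from the paper.

    import numpy as np

    def bounded_rational_policy(q_values, default_probs, beta):
        # Closed-form bounded-rational policy over a discrete action set:
        #   pi*(a) proportional to pi0(a) * exp(beta * Q(a)).
        # beta -> 0 recovers the default policy (no deliberation);
        # beta -> infinity recovers the fully rational argmax policy.
        logits = np.log(default_probs) + beta * q_values
        logits -= logits.max()            # shift for numerical stability
        p = np.exp(logits)
        return p / p.sum()

    def soft_value_importance_sampling(q_fn, sample_default, beta, n=1000):
        # Monte-Carlo (importance sampling) estimate of the free-energy value
        #   V = (1/beta) * log E_{a ~ pi0}[ exp(beta * Q(a)) ],
        # using only samples from the default policy pi0 -- this is how the
        # normalizer of pi* can be estimated without enumerating all actions.
        actions = sample_default(n)       # n actions drawn from pi0
        logw = beta * q_fn(actions)       # log importance weights
        m = logw.max()                    # log-sum-exp trick for stability
        return (m + np.log(np.exp(logw - m).mean())) / beta

In this sketch the inverse temperature beta plays the role of the information-processing budget underlying point (1): small beta keeps the policy close to the cheap default, while large beta approaches full rationality at a higher effective computational cost.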