

Poster in Workshop: Pluralistic Alignment Workshop

Value-Aligned Imitation via Focused Satisficing

Rushit Shah · Nikolaos Agadakos · Synthia Sasulski · Ali Farajzadeh · Sanjiban Choudhury · Brian Ziebart


Abstract:

According to satisficing theory, humans often choose acceptable behavior based on their personal aspirations rather than achieving (near-)optimality. For example, a lunar lander demonstration that successfully lands without crashing might be acceptable to a novice despite being slow or jerky. When human aspirations are much lower than the capabilities of an autonomous system, this gap allows learned policies to sufficiently satisfy differing human objectives. Maximizing the likelihood of demonstrator satisfaction also provides guidance for learning under competing objectives that existing imitation learning methods struggle to resolve. Using a margin-based objective to guide deep reinforcement learning, our focused satisficing approach to imitation learning seeks a policy that surpasses the demonstrator's aspiration levels, defined over trajectories, on unseen demonstrations without explicitly learning those aspirations. We show experimentally that this focuses the policy on imitating higher-quality demonstrations better than existing imitation learning methods, providing much higher rates of guaranteed acceptability to the demonstrator and competitive true returns across a range of environments.
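The sketch below is a minimal, hypothetical illustration of how a margin-based satisficing objective of this kind could be expressed, assuming PyTorch; it treats each demonstration's own return as an upper bound on the demonstrator's (unobserved) aspiration, so surpassing it by a margin conservatively guarantees acceptability. The function name, arguments, and the use of returns in place of learned trajectory values are assumptions for illustration only and do not reproduce the authors' implementation.

```python
import torch

def satisficing_margin_loss(policy_returns: torch.Tensor,
                            demo_returns: torch.Tensor,
                            margin: float = 1.0) -> torch.Tensor:
    """Hypothetical hinge-style margin loss for satisficing imitation.

    policy_returns: estimated returns of the learned policy, one entry per
                    demonstration's start condition.
    demo_returns:   returns attained by the demonstrations; since the
                    demonstrator accepted these trajectories, each return
                    upper-bounds that demonstrator's aspiration level.
    """
    # Penalize any case where the policy fails to exceed the demonstration's
    # return by at least `margin`; exceeding it implies exceeding the
    # aspiration without ever learning the aspiration explicitly.
    violations = torch.clamp(demo_returns + margin - policy_returns, min=0.0)
    return violations.mean()
```

In use, a loss of this shape would be minimized alongside a deep RL update, so that gradient pressure concentrates on the demonstrations the policy has not yet surpassed, i.e. the higher-quality ones.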
