NeurIPS Poster Distributional Pareto-Optimal Multi-Objective Reinforcement Learning

Poster

Distributional Pareto-Optimal Multi-Objective Reinforcement Learning

Xin-Qiang Cai · Pushi Zhang · Li Zhao · Jiang Bian · Masashi Sugiyama · Ashley Llorens

Great Hall & Hall B1+B2 (level 1) #1407

[ Abstract ]

[ Paper] [ Poster] [ OpenReview]

Abstract:

Multi-objective reinforcement learning (MORL) has been proposed to learn control policies over multiple competing objectives with each possible preference over returns. However, current MORL algorithms fail to account for distributional preferences over the multi-variate returns, which are particularly important in real-world scenarios such as autonomous driving. To address this issue, we extend the concept of Pareto-optimality in MORL into distributional Pareto-optimality, which captures the optimality of return distributions, rather than the expectations. Our proposed method, called Distributional Pareto-Optimal Multi-Objective Reinforcement Learning~(DPMORL), is capable of learning distributional Pareto-optimal policies that balance multiple objectives while considering the return uncertainty. We evaluated our method on several benchmark problems and demonstrated its effectiveness in discovering distributional Pareto-optimal policies and satisfying diverse distributional preferences compared to existing MORL methods.

Chat is not available.