

Poster in Workshop: Bayesian Decision-making and Uncertainty: from probabilistic and spatiotemporal modeling to sequential experiment design

Convergence Rates of Bayesian Network Policy Gradient for Cooperative Multi-Agent Reinforcement Learning

Dingyang Chen · Zhenyu Zhang · Xiaolong Kuang · Xinyang Shen · Ozalp Ozer · Qi Zhang

Keywords: [ multi-agent reinforcement learning ] [ Bayesian network ] [ multi-agent coordination ]


Abstract:

Human coordination often benefits from executing actions in a correlated manner, leading to improved cooperation. This concept holds potential for enhancing cooperative multi-agent reinforcement learning (MARL). Despite this, recent advances in MARL predominantly focus on decentralized execution, which favors scalability by avoiding action correlation among agents. A recent study introduced a Bayesian network to capture correlations among agents' action selections within their joint policy and demonstrated global convergence to Nash equilibria under a tabular softmax policy parameterization in cooperative Markov games. In this work, we extend these theoretical results by establishing the convergence rate of policy gradient for the Bayesian network joint policy with log-barrier regularization.
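For context, the following is a minimal sketch of the objects the abstract refers to, under common assumptions from the tabular softmax policy gradient literature; the notation (the parent set pa(i), the regularization weight \lambda, the initial state distribution \mu) is illustrative and may differ from the paper's exact definitions. A Bayesian-network joint policy for n agents factorizes the joint action distribution along a directed acyclic graph over agents, with each conditional given a tabular softmax parameterization:

\[
\pi_\theta(a \mid s) \;=\; \prod_{i=1}^{n} \pi_{\theta_i}\!\left(a_i \,\middle|\, s,\, a_{\mathrm{pa}(i)}\right),
\qquad
\pi_{\theta_i}(a_i \mid s, a_{\mathrm{pa}(i)}) \;=\; \frac{\exp\!\big(\theta_i(s, a_{\mathrm{pa}(i)}, a_i)\big)}{\sum_{a_i'} \exp\!\big(\theta_i(s, a_{\mathrm{pa}(i)}, a_i')\big)}.
\]

A log-barrier regularized objective, in the style commonly used to derive convergence rates for softmax policy gradient (e.g., Agarwal et al., 2021), augments the value of the joint policy with a term that keeps every conditional probability bounded away from zero:

\[
L_\lambda(\theta) \;=\; V^{\pi_\theta}(\mu) \;+\; \frac{\lambda}{Z} \sum_{i}\sum_{s}\sum_{a_{\mathrm{pa}(i)}}\sum_{a_i} \log \pi_{\theta_i}(a_i \mid s, a_{\mathrm{pa}(i)}),
\]

where Z is a normalizing count of the regularized entries and \lambda > 0 controls the barrier strength. Bounding the conditionals away from zero is the standard device that turns asymptotic global convergence into an explicit convergence rate for gradient ascent on the regularized objective.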
