Online Bayesian Persuasion
Matteo Castiglioni, Andrea Celli, Alberto Marchesi, Nicola Gatti
Spotlight presentation: Orals & Spotlights Track 11: Learning Theory
on 2020-12-08T08:30:00-08:00 - 2020-12-08T08:40:00-08:00
on 2020-12-08T08:30:00-08:00 - 2020-12-08T08:40:00-08:00
Poster Session 2 (more posters)
on 2020-12-08T09:00:00-08:00 - 2020-12-08T11:00:00-08:00
GatherTown: Bandit algorithms and reinforcement learning ( Town B3 - Spot B0 )
on 2020-12-08T09:00:00-08:00 - 2020-12-08T11:00:00-08:00
GatherTown: Bandit algorithms and reinforcement learning ( Town B3 - Spot B0 )
Join GatherTown
Only iff poster is crowded, join Zoom . Authors have to start the Zoom call from their Profile page / Presentation History.
Only iff poster is crowded, join Zoom . Authors have to start the Zoom call from their Profile page / Presentation History.
Toggle Abstract Paper (in Proceedings / .pdf)
Abstract: In Bayesian persuasion, an informed sender has to design a signaling scheme that discloses the right amount of information so as to influence the behavior of a self-interested receiver. This kind of strategic interaction is ubiquitous in real economic scenarios. However, the original model by Kamenica and Gentzkow makes some stringent assumptions which limit its applicability in practice. One of the most limiting assumptions is arguably that, in order to compute an optimal signaling scheme, the sender is usually required to know the receiver's utility function. In this paper, we relax this assumption through an online learning framework in which the sender faces a receiver with unknown type. At each round, the receiver's type is chosen adversarially from a finite set of possible types. We are interested in no-regret algorithms prescribing a signaling scheme at each round of the repeated interaction with performances close to that of the best-in-hindsight signaling scheme. First, we prove a hardness result on the per-iteration running time required to achieve the no-regret property. Then, we provide algorithms for the full and partial information model which exhibit regret sublinear in the number of rounds and polynomial in the parameters of the game.