Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits
Shinji Ito, Shuichi Hirahara, Tasuku Soma, Yuichi Yoshida
Spotlight presentation: Orals & Spotlights Track 24: Learning Theory
on 2020-12-09T19:10:00-08:00 - 2020-12-09T19:20:00-08:00
Poster Session 5
on 2020-12-09T21:00:00-08:00 - 2020-12-09T23:00:00-08:00
GatherTown: Bandit Algorithms / Online Learning (Town E1 - Spot D0)
Abstract: We propose novel algorithms with first- and second-order regret bounds for adversarial linear bandits. These bounds imply that our algorithms perform well when some action achieves a small cumulative loss or when the losses have small variance. Moreover, our algorithms require weaker assumptions than existing ones: they work on discrete action sets as well as continuous ones without a priori knowledge of the losses, and they run efficiently whenever a linear optimization oracle for the action set is available. These results are obtained by combining optimistic online optimization, continuous multiplicative weight update methods, and a novel technique that we call distribution truncation. We also show that the regret bounds of our algorithms are tight up to polylogarithmic factors.
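As a rough illustration of the terminology, first- and second-order regret bounds typically take the forms sketched below. The notation ($L_T^*$, $Q_T$, and the $\sqrt{d\,\cdot}$ scaling) is an assumed generic form for adversarial linear bandits, not the paper's exact statements:

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Setting: in rounds $t = 1,\dots,T$ the learner picks $a_t$ from an
% action set $\mathcal{A} \subseteq \mathbb{R}^d$ and incurs the
% linear loss $\langle \ell_t, a_t \rangle$ chosen by an adversary.
\[
  R_T \;=\; \mathbb{E}\Biggl[\sum_{t=1}^{T} \langle \ell_t, a_t \rangle\Biggr]
      \;-\; \min_{a \in \mathcal{A}} \sum_{t=1}^{T} \langle \ell_t, a \rangle .
\]
% First-order bound: scales with the best action's cumulative loss
% $L_T^{*} = \min_{a \in \mathcal{A}} \sum_{t=1}^{T} \langle \ell_t, a \rangle$,
% so the regret is small whenever some fixed action performs well.
\[
  R_T \;=\; \tilde{O}\bigl(\sqrt{d \, L_T^{*}}\bigr).
\]
% Second-order bound: scales with a cumulative variance-type quantity,
% e.g.\ $Q_T = \sum_{t=1}^{T} \lVert \ell_t - \bar{\ell} \rVert^2$ for an
% average loss vector $\bar{\ell}$, so nearly constant losses give small regret.
\[
  R_T \;=\; \tilde{O}\bigl(\sqrt{d \, Q_T}\bigr).
\]
% Both recover the worst-case $\tilde{O}(\sqrt{dT})$ rate and improve on it
% whenever $L_T^{*}$ or $Q_T$ grows sublinearly in $T$.
\end{document}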