Poster
Bandit Social Learning under Myopic Behavior
Kiarash Banihashem · MohammadTaghi Hajiaghayi · Suho Shin · Aleksandrs Slivkins
Great Hall & Hall B1+B2 (level 1) #1827
Abstract:
We study social learning dynamics motivated by reviews on online platforms. Theagents collectively follow a simple multi-armed bandit protocol, but each agentacts myopically, without regards to exploration. We allow a wide range of myopicbehaviors that are consistent with (parameterized) confidence intervals for the arms’expected rewards. We derive stark exploration failures for any such behavior, andprovide matching positive results. As a special case, we obtain the first generalresults on failure of the greedy algorithm in bandits, thus providing a theoreticalfoundation for why bandit algorithms should explore.
Chat is not available.