Spotlight in Workshop: Reinforcement Learning for Real Life (RL4RealLife) Workshop
Optimizing Audio Recommendations for the Long-Term
Lucas Maystre · Daniel Russo · Yu Zhao
We study the problem of optimizing recommender systems for outcomes that are realized over several weeks or months. Successfully addressing this problem requires overcoming difficult statistical and organizational challenges. We begin by drawing on reinforcement learning to formulate a comprehensive model of users' recurring relationship with a recommender system. We then identify a few key assumptions that lead to simple, testable recommender system prototypes that explicitly optimize for the long term. We apply our approach to a podcast recommender system at a large online audio streaming service, and we demonstrate that purposefully optimizing for long-term outcomes can lead to substantial performance gains over approaches optimizing for short-term proxies.