Abstract:
In this paper, we consider stochastic multi-armed bandits (MABs) with heavy-tailed rewards, whose p-th moment is bounded by a constant nu_p for 1