Skip to yearly menu bar Skip to main content


Poster

Improved Regret of Linear Ensemble Sampling

Harin Lee · Min-hwan Oh

[ ]
Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract: In this work, we close the fundamental gap of theory and practice by providing an improved regret bound for linear ensemble sampling. We prove that with an ensemble size logarithmic in \(T\), linear ensemble sampling can achieve a frequentist regret bound of \(\tilde{\Ocal}(d^{3/2}\sqrt{T})\), matching state-of-the-art results for randomized linear bandit algorithms. Our approach introduces a general regret analysis framework for linear bandit algorithms. Additionally, we reveal a significant relationship between ensemble sampling and Perturbed-History Exploration (PHE), showing that LinPHE is a special case of linear ensemble sampling when the ensemble size equals \(T\). This insight allows us to derive a new regret bound of \(\tilde{\Ocal}(d^{3/2}\sqrt{T})\) for LinPHE, independent of $K$. Our contributions advance the theoretical foundation of ensemble sampling, bringing its regret bounds in line with the best known bounds for other randomized exploration algorithms.

Live content is unavailable. Log in and register to view live content