NeurIPS Poster Speculative Monte-Carlo Tree Search

Poster

Speculative Monte-Carlo Tree Search

Scott Cheng · Mahmut T Kandemir · Ding-Yong Hong

West Ballroom A-D #6605

[ Abstract ]

[ Paper] [ Slides] [ Poster] [ OpenReview]

Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract: Monte-Carlo tree search (MCTS) is an influential sequential decision-making algorithm notably employed in AlphaZero. Despite its success, the primary challenge in AlphaZero training lies in its prolonged time-to-solution due to the high latency imposed by the sequential MCTS process. To address this challenge, this paper proposes and evaluates an inter-decision parallelization strategy called speculative MCTS, a new type of parallelism in AlphaZero which implements speculative execution. This approach allows for the parallel execution of future moves before the current MCTS computations are completed, thus reducing the latency. Additionally, we analyze factors contributing to the overall speedup by studying the synergistic effects of speculation and neural network caching in MCTS. We also provide an analytical model that can be used to evaluate the potential of different speculation strategies before they are implemented and deployed. Our empirical findings indicate that the proposed speculative MCTS can reduce training latency by 5.81$\times$ in 9x9 Go games. Moreover, our study shows that speculative execution can enhance the NN cache hit rate by 26\% during midgame. Overall, our end-to-end evaluation indicates 1.91$\times$ speedup in 19x19 Go training time, compared to the state-of-the-art KataGo program.

Chat is not available.