Published April 24, 2024 | Version v1
Publication

Improving continuous Monte Carlo Tree Search

Description

Monte-Carlo Tree Search (MCTS) is largely responsible for improvements not only in many computer games, including Go and General Game Playing (GGP), but also in real-world continuous Markov decision process problems. MCTS initially relied on Upper Confidence bounds applied to Trees (UCT), but the Rapid Action Value Estimation (RAVE) heuristic quickly took over in both the discrete and continuous domains. Recently, Generalized RAVE (GRAVE) outperformed these heuristics in the discrete domain. This paper is concerned with extending the GRAVE heuristic to continuous action and state spaces. To enhance its performance, we suggest an action decomposition strategy that breaks multidimensional actions down into multiple unidimensional actions, and we propose a selective policy based on constraints that can be used to bias the playouts and to select promising actions in the tree. The approach is experimentally validated on a real-world biological problem.
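For context, GRAVE scores actions with the usual RAVE blend of a node's own statistics and AMAF (all-moves-as-first) statistics, with the one difference that the AMAF statistics are taken from the closest ancestor having at least a threshold (`ref`) number of playouts. The following is a minimal sketch, not the paper's implementation: the `Node` record, the `bias` constant, and the function names are illustrative, and the weighting schedule follows Gelly and Silver's β formula.

```python
from collections import namedtuple

# Hypothetical node record: own statistics plus AMAF statistics.
Node = namedtuple("Node", "visits mean amaf_visits amaf_mean")

def rave_beta(node_visits, amaf_visits, bias=1e-5):
    """Gelly & Silver's schedule: the weight of the AMAF estimate
    decays as the node's own visit count grows."""
    return amaf_visits / (amaf_visits + node_visits
                          + bias * amaf_visits * node_visits)

def grave_value(node, amaf_node):
    """GRAVE action value: blend the node's own mean with the AMAF
    mean taken from amaf_node (an ancestor, not the node itself)."""
    beta = rave_beta(node.visits, amaf_node.amaf_visits)
    return (1.0 - beta) * node.mean + beta * amaf_node.amaf_mean

def amaf_source(path_to_node, ref):
    """GRAVE's change over RAVE: use the AMAF statistics of the
    closest node on the root-to-node path with >= ref playouts."""
    for n in reversed(path_to_node):   # node first, then ancestors
        if n.visits >= ref:
            return n
    return path_to_node[0]             # fall back to the root
```

With a large `ref`, every node borrows the root's well-sampled AMAF statistics; as subtrees accumulate playouts, nodes switch to closer (more specific) ancestors, which is what makes GRAVE more robust than plain RAVE early in the search.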

Additional details

Identifiers

URL
https://hal.science/hal-04557914
URN
urn:oai:HAL:hal-04557914v1

Origin repository

UNICA