Published April 24, 2024
| Version v1
Publication
Improving continuous Monte Carlo Tree Search
Contributors
Others:
- Scalable and Pervasive softwARe and Knowledge Systems (Laboratoire I3S - SPARKS); Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S); Université Nice Sophia Antipolis (1965 - 2019) (UNS); Centre National de la Recherche Scientifique (CNRS); Université Côte d'Azur (UniCA)
- Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision (LAMSADE); Université Paris Dauphine-PSL; Université Paris Sciences et Lettres (PSL); Centre National de la Recherche Scientifique (CNRS)
Description
Monte-Carlo Tree Search (MCTS) is largely responsible for improvements not only in many computer games, including Go and General Game Playing (GGP), but also in real-world continuous Markov decision process problems. MCTS initially relied on Upper Confidence bounds applied to Trees (UCT), but the Rapid Action Value Estimation (RAVE) heuristic quickly took over in both the discrete and continuous domains. Recently, Generalized RAVE (GRAVE) outperformed these heuristics in the discrete domain. This paper extends the GRAVE heuristic to continuous action and state spaces. To enhance its performance, we suggest an action decomposition strategy that breaks multidimensional actions down into multiple unidimensional actions, and we propose a constraint-based selective policy that can be used to bias the playouts and to select promising actions in the tree. The approach is experimentally validated on a real-world biological problem.
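The GRAVE heuristic mentioned in the abstract selects actions by blending a node's own Monte-Carlo mean with AMAF (all-moves-as-first) statistics taken from the closest ancestor that has accumulated enough playouts. A minimal sketch of the usual β-weighted blend underlying RAVE-style estimators is shown below; the function names, argument names, and the `bias` constant are illustrative assumptions, not the paper's notation.

```python
def grave_value(q, n, amaf_q, amaf_n, bias=1e-5):
    """Blend a node's Monte-Carlo mean reward q (over n visits) with
    AMAF statistics (amaf_q over amaf_n playouts), which in GRAVE come
    from the closest ancestor with a sufficient number of playouts.

    beta -> 1 when the node has few visits (trust AMAF early),
    beta -> 0 as n grows (trust the node's own statistics).
    """
    beta = amaf_n / (amaf_n + n + bias * amaf_n * n)
    return (1.0 - beta) * q + beta * amaf_q
```

With no visits at the node itself, the estimate falls back entirely on the ancestor's AMAF mean; with no AMAF playouts, it reduces to the node's own mean, so the blend degrades gracefully at both extremes.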
Additional details
Identifiers
- URL
- https://hal.science/hal-04557914
- URN
- urn:oai:HAL:hal-04557914v1
Origin repository
- UNICA