Reinforcement learning produces dominant strategies for the Iterated Prisoner's Dilemma
- Others:
- Google Inc.
- Cardiff University
- Independent
- Institut Sophia Agrobiotech (ISA); Institut National de la Recherche Agronomique (INRA); Université Nice Sophia Antipolis (1965-2019) (UNS); COMUE Université Côte d'Azur (2015-2019) (COMUE UCA); Centre National de la Recherche Scientifique (CNRS)
- COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)
- Google Inc.
Description
We present tournament results and several powerful strategies for the Iterated Prisoner's Dilemma created using reinforcement learning techniques (evolutionary and particle swarm algorithms). These strategies are trained to perform well against a corpus of over 170 distinct opponents, including many well-known and classic strategies. All the trained strategies win standard tournaments against the full collection of other opponents. The trained strategies and one particular human-designed strategy are also the top performers in noisy tournaments.
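To make the tournament setting concrete, here is a minimal sketch of a round-robin Iterated Prisoner's Dilemma tournament. It is not the authors' code: the payoff values (3, 0, 5, 1) are the conventional ones and are an assumption here, the three opponents are classic textbook strategies rather than the trained ones, and the function names `play_match` and `round_robin` are hypothetical. Noise and the training procedure itself are left out.

```python
# Conventional Prisoner's Dilemma payoffs (assumed; not stated in this record):
# (my move, opponent's move) -> my score, with C = cooperate, D = defect.
PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"


def always_defect(own_history, opp_history):
    """Defect on every turn."""
    return "D"


def always_cooperate(own_history, opp_history):
    """Cooperate on every turn."""
    return "C"


def play_match(strategy_a, strategy_b, turns=10):
    """Play one iterated match and return the two total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(turns):
        # Each strategy sees its own history first, then the opponent's.
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


def round_robin(strategies, turns=10):
    """Sum each strategy's score over one match against every other strategy."""
    totals = {name: 0 for name in strategies}
    names = list(strategies)
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            score_a, score_b = play_match(
                strategies[name_a], strategies[name_b], turns
            )
            totals[name_a] += score_a
            totals[name_b] += score_b
    return totals


if __name__ == "__main__":
    players = {
        "Tit For Tat": tit_for_tat,
        "Defector": always_defect,
        "Cooperator": always_cooperate,
    }
    print(round_robin(players, turns=10))
```

In a field this small, unconditional defection scores highest because it exploits the unconditional cooperator. Rankings depend heavily on the makeup of the opponent pool, which is one reason a large and varied corpus matters for training and evaluation.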
Abstract
Marc Harper and Vincent Knight contributed equally to this work. Martin Jones, Georgios Koutsovoulos, Nikoleta E. Glynatsi and Owen Campbell also contributed equally to this work.
International audience
Additional details
- URL
- https://hal.inrae.fr/hal-02625592
- URN
- urn:oai:HAL:hal-02625592v1
- Origin repository
- UNICA