Abstract:Due to their complex dynamics, combinatorial games are a key test case and application for algorithms that train game playing agents. Among those algorithms that train using self-play are coevolutionary algorithms (CoEAs). CoEAs evolve a population of individuals by iteratively selecting the strongest based on their interactions against contemporaries, and using those selected as parents for the following generation (via randomised mutation and crossover). However, the successful application of CoEAs for game playing is difficult due to pathological behaviours such as cycling, an issue especially critical for games with intransitive payoff landscapes. Insight into how to design CoEAs to avoid such behaviours can be provided by runtime analysis. In this paper, we push the scope of runtime analysis to combinatorial games, proving a general upper bound for the number of simulated games needed for UMDA (a type of CoEA) to discover (with high probability) an optimal strategy for an impartial combinatorial game. This result applies to any impartial combinatorial game, and for many games the implied bound is polynomial or quasipolynomial as a function of the number of game positions. After proving the main result, we provide several applications to simple well-known games: Nim, Chomp, Silver Dollar, and Turning Turtles. As the first runtime analysis for CoEAs on combinatorial games, this result is a critical step towards a comprehensive theoretical framework for coevolution.