Abstract:The graph colouring problem consists of assigning labels, or colours, to the vertices of a graph such that no two adjacent vertices share the same colour. In this work we investigate whether deep reinforcement learning can be used to discover a competitive construction heuristic for graph colouring. Our proposed approach, ReLCol, uses deep Q-learning together with a graph neural network for feature extraction, and employs a novel way of parameterising the graph that results in improved performance. Using standard benchmark graphs with varied topologies, we empirically evaluate the benefits and limitations of the heuristic learned by ReLCol relative to existing construction algorithms, and demonstrate that reinforcement learning is a promising direction for further research on the graph colouring problem.