Deep multi-task methods, in which several tasks are learned within a single network, have recently attracted increasing attention. A key driver of this attention is their capacity to capture inter-task relationships. Current approaches either rely solely on weight sharing, or add explicit dependency modelling by decomposing the joint task distribution using the chain rule of probability. While the latter strategy yields comprehensive modelling of inter-task relationships, it requires imposing an arbitrary order on an unordered task set. Most importantly, this choice of sequence ordering has been identified as a critical source of performance variation. In this paper, we present Multi-Order Network (MONET), a multi-task learning method with joint task order optimization. MONET uses a differentiable order selection based on soft order modelling inside Birkhoff's polytope to jointly learn task-wise recurrent modules and their optimal chaining order. Furthermore, we introduce warm-up and order dropout, which enhance order selection by encouraging order exploration. Experimentally, we first validate MONET's capacity to retrieve the optimal order in a toy environment. Second, we use an attribute detection scenario to show that MONET outperforms existing multi-task baselines across a wide range of dependency settings. Finally, we demonstrate that MONET significantly advances the state of the art in Facial Action Unit detection.
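The "soft order modelling inside Birkhoff's polytope" mentioned above refers to relaxing a discrete task permutation to a doubly stochastic matrix (the Birkhoff polytope is the convex hull of permutation matrices). A minimal sketch of one common way to obtain such a relaxation, Sinkhorn normalization of an unconstrained parameter matrix, is shown below; the function name and iteration count are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def sinkhorn(log_alpha, n_iters=50):
    """Map an unconstrained matrix of logits toward the Birkhoff polytope.

    Alternately normalizes rows and columns in log space; as n_iters
    grows, exp(log_alpha) approaches a doubly stochastic matrix, i.e.
    a soft (differentiable) stand-in for a task permutation.
    """
    for _ in range(n_iters):
        # Row normalization: each row sums to 1 after exponentiation.
        log_alpha = log_alpha - np.logaddexp.reduce(log_alpha, axis=1, keepdims=True)
        # Column normalization: each column sums to 1 after exponentiation.
        log_alpha = log_alpha - np.logaddexp.reduce(log_alpha, axis=0, keepdims=True)
    return np.exp(log_alpha)

# Soft ordering over 4 tasks from random logits (illustrative only).
rng = np.random.default_rng(0)
P = sinkhorn(rng.normal(size=(4, 4)))
```

Because every step is differentiable, the logits producing `P` can be trained jointly with the task modules, which is what makes end-to-end order optimization possible.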