Abstract: In line with the growing trend of using machine learning to improve the solving of combinatorial optimisation problems, one promising idea is to improve node selection within a mixed integer programming (MIP) branch-and-bound tree by using a learned policy. In contrast to previous work employing imitation learning, our policy focuses on learning which of a node's children to select. We present an offline method to learn such a policy in two settings: one that is approximate, committing to pruning nodes; and one that is exact, backtracking from a leaf to apply a different node-selection strategy. We apply the policy within the popular open-source solver SCIP. Empirical results on four MIP datasets indicate that our node selection policy finds solutions more quickly than the state of the art in the literature, though not as quickly as the state-of-practice SCIP node selector. While we do not beat the highly-optimised SCIP baseline in solving time to exact solutions, our approximation-based policies achieve consistently better optimality gaps than all baselines, provided the predictive model is sufficiently accurate. Furthermore, the results indicate that, when a time limit is applied, our approximation method finds better solutions than all baselines on the majority of problems tested.