Abstract:Global optimization of decision trees has shown to be promising in terms of accuracy, size, and consequently human comprehensibility. However, many of the methods used rely on general-purpose solvers for which scalability remains an issue. Dynamic programming methods have been shown to scale much better because they exploit the tree structure by solving subtrees as independent subproblems. However, this only works when an objective can be optimized separately for subtrees. We explore this relationship in detail and show necessary and sufficient conditions for such separability and generalize previous dynamic programming approaches into a framework that can optimize any combination of separable objectives and constraints. Experiments on four application domains show the general applicability of this framework, while outperforming the scalability of general-purpose solvers by a large margin.
Abstract:In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value. Typical algorithms exploit additive structure in the value function, but in the fully-observable multi-agent MDP setting (MMDP) such structure is not present. We propose a new optimal solver for transition-independent MMDPs, in which agents can only affect their own state but their reward depends on joint transitions. We represent these dependencies compactly in conditional return graphs (CRGs). Using CRGs the value of a joint policy and the bounds on partially specified joint policies can be efficiently computed. We propose CoRe, a novel branch-and-bound policy search algorithm building on CRGs. CoRe typically requires less runtime than the available alternatives and finds solutions to problems previously unsolvable.
Abstract:We present two new and efficient algorithms for computing all-pairs shortest paths. The algorithms operate on directed graphs with real (possibly negative) weights. They make use of directed path consistency along a vertex ordering d. Both algorithms run in O(n^2 w_d) time, where w_d is the graph width induced by this vertex ordering. For graphs of constant treewidth, this yields O(n^2) time, which is optimal. On chordal graphs, the algorithms run in O(nm) time. In addition, we present a variant that exploits graph separators to arrive at a run time of O(n w_d^2 + n^2 s_d) on general graphs, where s_d andlt= w_d is the size of the largest minimal separator induced by the vertex ordering d. We show empirically that on both constructed and realistic benchmarks, in many cases the algorithms outperform Floyd-Warshalls as well as Johnsons algorithm, which represent the current state of the art with a run time of O(n^3) and O(nm + n^2 log n), respectively. Our algorithms can be used for spatial and temporal reasoning, such as for the Simple Temporal Problem, which underlines their relevance to the planning and scheduling community.