Abstract:Mathematical proofs can be mechanised using proof assistants to eliminate gaps and errors. However, mechanisation still requires intensive labour. To promote automation, it is essential to capture high-level human mathematical reasoning, which we address as the problem of generating suitable propositions. We build a non-synthetic dataset from the largest repository of mechanised proofs and propose a task on causal reasoning, where a model is required to fill in a missing intermediate proposition given a causal context. Our experiments (using various neural sequence-to-sequence models) reveal that while the task is challenging, neural models can indeed capture non-trivial mathematical reasoning. We further propose a hierarchical transformer model that outperforms the transformer baseline.
Abstract:Heuristics in theorem provers are often parameterised. Modern theorem provers such as Vampire utilise a wide array of heuristics to control the search space explosion, thereby requiring optimisation of a large set of parameters. An exhaustive search in this multi-dimensional parameter space is intractable in most cases, yet the performance of the provers is highly dependent on the parameter assignment. In this work, we introduce a principled probablistic framework for heuristics optimisation in theorem provers. We present results using a heuristic for premise selection and The Archive of Formal Proofs (AFP) as a case study.
Abstract:Cylindrical Algebraic Decomposition (CAD) is a key tool in computational algebraic geometry, best known as a procedure to enable Quantifier Elimination over real-closed fields. However, it has a worst case complexity doubly exponential in the size of the input, which is often encountered in practice. It has been observed that for many problems a change in algorithm settings or problem formulation can cause huge differences in runtime costs, changing problem instances from intractable to easy. A number of heuristics have been developed to help with such choices, but the complicated nature of the geometric relationships involved means these are imperfect and can sometimes make poor choices. We investigate the use of machine learning (specifically support vector machines) to make such choices instead. Machine learning is the process of fitting a computer model to a complex function based on properties learned from measured data. In this paper we apply it in two case studies: the first to select between heuristics for choosing a CAD variable ordering; the second to identify when a CAD problem instance would benefit from Groebner Basis preconditioning. These appear to be the first such applications of machine learning to Symbolic Computation. We demonstrate in both cases that the machine learned choice outperforms human developed heuristics.
Abstract:Cylindrical Algebraic Decomposition (CAD) is a key tool in computational algebraic geometry, particularly for quantifier elimination over real-closed fields. However, it can be expensive, with worst case complexity doubly exponential in the size of the input. Hence it is important to formulate the problem in the best manner for the CAD algorithm. One possibility is to precondition the input polynomials using Groebner Basis (GB) theory. Previous experiments have shown that while this can often be very beneficial to the CAD algorithm, for some problems it can significantly worsen the CAD performance. In the present paper we investigate whether machine learning, specifically a support vector machine (SVM), may be used to identify those CAD problems which benefit from GB preconditioning. We run experiments with over 1000 problems (many times larger than previous studies) and find that the machine learned choice does better than the human-made heuristic.
Abstract:Cylindrical algebraic decomposition(CAD) is a key tool in computational algebraic geometry, particularly for quantifier elimination over real-closed fields. When using CAD, there is often a choice for the ordering placed on the variables. This can be important, with some problems infeasible with one variable ordering but easy with another. Machine learning is the process of fitting a computer model to a complex function based on properties learned from measured data. In this paper we use machine learning (specifically a support vector machine) to select between heuristics for choosing a variable ordering, outperforming each of the separate heuristics.
Abstract:We present a straightforward embedding of quantified multimodal logic in simple type theory and prove its soundness and completeness. Modal operators are replaced by quantification over a type of possible worlds. We present simple experiments, using existing higher-order theorem provers, to demonstrate that the embedding allows automated proofs of statements in these logics, as well as meta properties of them.