Abstract:There has been a growing need to devise processes that can create comprehensive datasets in the world of Computer Algebra, both for accurate benchmarking and for new intersections with machine learning technology. We present here a method to generate integrands that are guaranteed to be integrable, dubbed the LIOUVILLE method. It is based on Liouville's theorem and the Parallel Risch Algorithm for symbolic integration. We show that this data generation method retains the best qualities of previous data generation methods, while overcoming some of the issues built into that prior work. The LIOUVILLE generator is able to generate sufficiently complex and realistic integrands, and could be used for benchmarking or machine learning training tasks related to symbolic integration.
Abstract:Computer Algebra Systems (e.g. Maple) are used in research, education, and industrial settings. One of their key functionalities is symbolic integration, where there are many sub-algorithms to choose from that can affect the form of the output integral, and the runtime. Choosing the right sub-algorithm for a given problem is challenging: we hypothesise that Machine Learning can guide this sub-algorithm choice. A key consideration of our methodology is how to represent the mathematics to the ML model: we hypothesise that a representation which encodes the tree structure of mathematical expressions would be well suited. We trained both an LSTM and a TreeLSTM model for sub-algorithm prediction and compared them to Maple's existing approach. Our TreeLSTM performs much better than the LSTM, highlighting the benefit of using an informed representation of mathematical expressions. It is able to produce better outputs than Maple's current state-of-the-art meta-algorithm, giving a strong basis for further research.
Abstract:There has been an increasing number of applications of machine learning to the field of Computer Algebra in recent years, including to the prominent sub-field of Symbolic Integration. However, machine learning models require an abundance of data for them to be successful and there exist few benchmarks on the scale required. While methods to generate new data already exist, they are flawed in several ways which may lead to bias in machine learning models trained upon them. In this paper, we describe how to use the Risch Algorithm for symbolic integration to create a dataset of elementary integrable expressions. Further, we show that data generated this way alleviates some of the flaws found in earlier methods.
Abstract:Document preparation systems like LaTeX offer the ability to render mathematical expressions as one would write these on paper. Using LaTeX, LaTeXML, and tools generated for use in the National Institute of Standards (NIST) Digital Library of Mathematical Functions, semantically enhanced mathematical LaTeX markup (semantic LaTeX) is achieved by using a semantic macro set. Computer algebra systems (CAS) such as Maple and Mathematica use alternative markup to represent mathematical expressions. By taking advantage of Youssef's Part-of-Math tagger and CAS internal representations, we develop algorithms to translate mathematical expressions represented in semantic LaTeX to corresponding CAS representations and vice versa. We have also developed tools for translating the entire Wolfram Encoding Continued Fraction Knowledge and University of Antwerp Continued Fractions for Special Functions datasets, for use in the NIST Digital Repository of Mathematical Formulae. The overall goal of these efforts is to provide semantically enriched standard conforming MathML representations to the public for formulae in digital mathematics libraries. These representations include presentation MathML, content MathML, generic LaTeX, semantic LaTeX, and now CAS representations as well.
Abstract:In this article we demonstrate how to solve a variety of problems and puzzles using the built-in SAT solver of the computer algebra system Maple. Once the problems have been encoded into Boolean logic, solutions can be found (or shown to not exist) automatically, without the need to implement any search algorithm. In particular, we describe how to solve the $n$-queens problem, how to generate and solve Sudoku puzzles, how to solve logic puzzles like the Einstein riddle, how to solve the 15-puzzle, how to solve the maximum clique problem, and finding Graeco-Latin squares.