Abstract:Decision diagrams (DDs) are powerful tools for effectively representing propositional formulas, and are widely used in many domains, in particular in formal verification and in knowledge compilation. Some forms of DDs (e.g., OBDDs, SDDs) are canonical, that is, (under given conditions on the atom list) they univocally represent equivalence classes of formulas. Given the limited expressiveness of propositional logic, a few attempts to leverage DDs to the SMT level have been presented in the literature. Unfortunately, these techniques still suffer from some limitations: most procedures are theory-specific; some produce theory DDs (T-DDs) which do not univocally represent T-valid formulas or T-inconsistent formulas; none of these techniques provably produces theory-canonical T-DDs, which (under given conditions on the T-atom list) univocally represent T-equivalence classes of formulas. Moreover, these procedures are not easy to implement, and very few implementations are actually available. In this paper, we present a novel and very general technique for leveraging DDs to the SMT level, which has several advantages: it is very easy to implement on top of an AllSMT solver and a DD package, which are used as black boxes; it works for every form of DD and every theory, or combination thereof, supported by the AllSMT solver; and it produces theory-canonical T-DDs whenever the underlying propositional DDs are canonical. We have implemented a prototype tool for both T-OBDDs and T-SDDs on top of OBDD and SDD packages and the MathSAT SMT solver. A preliminary empirical evaluation supports the effectiveness of the approach.
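To give a concrete flavour of the technique, here is a minimal, self-contained Python sketch of the pipeline on a toy example: a brute-force enumerator with a hand-written consistency oracle over the two LRA atoms (x<1) and (x<2) stands in for the AllSMT solver, and a small hash-consed Shannon expansion stands in for the DD package. Every name below is illustrative, not taken from the actual implementation.

    from itertools import product

    # T-atoms of the running example: (x < 1) and (x < 2).
    ATOMS = ["x<1", "x<2"]

    def t_consistent(mu):
        """Toy theory oracle: (x<1) true with (x<2) false is the only
        T-inconsistent combination of our two atoms."""
        return not (mu["x<1"] and not mu["x<2"])

    def allsmt(phi):
        """Stand-in for an AllSMT call: enumerate the total truth assignments
        that propositionally satisfy phi AND are theory-consistent."""
        for bits in product([False, True], repeat=len(ATOMS)):
            mu = dict(zip(ATOMS, bits))
            if phi(mu) and t_consistent(mu):
                yield mu

    # ROBDD nodes are (atom, low, high) triples, hash-consed for canonicity.
    TRUE, FALSE = "1", "0"
    _table = {}

    def mk(atom, lo, hi):
        if lo == hi:
            return lo
        return _table.setdefault((atom, lo, hi), (atom, lo, hi))

    def compile_dd(models, i=0, fixed=None):
        """Build the canonical OBDD of the disjunction of the enumerated
        T-consistent assignments, by Shannon expansion on ATOMS[i:]."""
        fixed = fixed or {}
        if i == len(ATOMS):
            return TRUE if any(all(m[a] == fixed[a] for a in ATOMS)
                               for m in models) else FALSE
        a = ATOMS[i]
        lo = compile_dd(models, i + 1, {**fixed, a: False})
        hi = compile_dd(models, i + 1, {**fixed, a: True})
        return mk(a, lo, hi)

    # Example input formula over the Boolean abstraction of the atoms.
    phi = lambda mu: mu["x<1"] or not mu["x<2"]   # (x<1) v not(x<2)
    print(compile_dd(list(allsmt(phi))))          # theory-canonical T-OBDD

The point the sketch mirrors is that the DD is built only from the theory-consistent assignments returned by the enumeration, so theory reasoning never needs to enter the DD package itself.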
Abstract:The probabilistic formal verification (PFV) of AI systems is in its infancy. So far, approaches have been limited to ad-hoc algorithms for specific classes of models and/or properties. We propose a unifying framework for the PFV of AI systems based on Weighted Model Integration (WMI), which allows us to frame the problem in very general terms. Crucially, this reduction enables the verification of many properties of interest, such as fairness, robustness, or monotonicity, over a wide range of machine learning models, without making strong distributional assumptions. We support the generality of the approach by solving multiple verification tasks with a single, off-the-shelf WMI solver, and then discuss the scalability challenges and research directions related to this promising framework.
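Schematically, and in our notation rather than the paper's, the reduction expresses the probability that a model satisfies a property as a ratio of two WMI computations, where \varphi_M encodes the model, \varphi_P the property, and w the (piecewise) input density:

    \Pr\big[\, M(\mathbf{x}) \models P \,\big] \;=\; \frac{\mathrm{WMI}(\varphi_M \wedge \varphi_P,\; w)}{\mathrm{WMI}(\varphi_M,\; w)}

Different properties then correspond to different choices of \varphi_P, while the solver stays the same.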
Abstract:The development of efficient exact and approximate algorithms for probabilistic inference is a long-standing goal of artificial intelligence research. Whereas substantial progress has been made in dealing with purely discrete or purely continuous domains, adapting the developed solutions to tackle hybrid domains, characterised by discrete and continuous variables and their relationships, is highly non-trivial. Weighted Model Integration (WMI) recently emerged as a unifying formalism for probabilistic inference in hybrid domains. Despite a considerable amount of recent work, allowing WMI algorithms to scale with the complexity of the hybrid problem is still a challenge. In this paper we highlight some substantial limitations of existing state-of-the-art solutions, and develop an algorithm that combines SMT-based enumeration, an efficient technique in formal verification, with an effective encoding of the problem structure. This allows our algorithm to avoid generating redundant models, resulting in drastic computational savings. Additionally, we show how SMT-based approaches can seamlessly deal with different integration techniques, both exact and approximate, significantly expanding the set of problems that can be tackled by WMI technology. An extensive experimental evaluation on both synthetic and real-world datasets confirms the substantial advantage of the proposed solution over existing alternatives. The application potential of this technology is further showcased on a prototypical task aimed at verifying the fairness of probabilistic programs.
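For reference, the standard definition from the WMI literature: given an SMT(LRA) formula \varphi over real variables \mathbf{x} and Boolean variables \mathbf{A}, and a weight function w, WMI sums, over the satisfying total truth assignments \mu of \varphi, the integral of w over the region that \mu induces on \mathbf{x}:

    \mathrm{WMI}(\varphi, w \mid \mathbf{x}, \mathbf{A}) \;=\; \sum_{\mu \,\in\, \mathcal{TA}(\varphi)} \int_{\mu^{\mathcal{LRA}}} w(\mathbf{x}, \mu^{\mathbf{A}}) \, d\mathbf{x}

where \mathcal{TA}(\varphi) is the set of total truth assignments over the atoms of \varphi, \mu^{\mathcal{LRA}} is the arithmetic part of \mu, and \mu^{\mathbf{A}} its Boolean part. Enumerating \mathcal{TA}(\varphi) without redundancy is precisely where SMT-based enumeration pays off.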
Abstract:Weighted Model Integration (WMI) is a popular formalism aimed at unifying approaches for probabilistic inference in hybrid domains, involving logical and algebraic constraints. Despite a considerable amount of recent work, allowing WMI algorithms to scale with the complexity of the hybrid problem is still a challenge. In this paper we highlight some substantial limitations of existing state-of-the-art solutions, and develop an algorithm that combines SMT-based enumeration, an efficient technique in formal verification, with an effective encoding of the problem structure. This allows our algorithm to avoid generating redundant models, resulting in substantial computational savings. An extensive experimental evaluation on both synthetic and real-world datasets confirms the advantage of the proposed solution over existing alternatives.
Abstract:Many procedures for SAT and SAT-related problems -- in particular those requiring the complete enumeration of satisfying truth assignments -- owe much of their efficiency to the detection of partial assignments satisfying an input formula. In this paper we analyze the notion of partial-assignment satisfiability -- in particular when dealing with non-CNF and existentially-quantified formulas -- raising a flag about the ambiguities and subtleties of this concept, and investigating their practical consequences. This analysis may drive the development of more effective assignment-enumeration algorithms.
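The following self-contained Python sketch illustrates one of the subtleties on a non-CNF formula (the representation and helper names are invented): the partial assignment {A = true} entails (A and B) or (A and not B), since every total extension satisfies it, yet Kleene-style three-valued evaluation of the formula cannot detect this, so two natural notions of "satisfying partial assignment" come apart.

    from itertools import product

    # Formulas as nested tuples: ("and", f, g), ("or", f, g), ("not", f),
    # or an atom name; PHI is (A & B) | (A & ~B).
    PHI = ("or", ("and", "A", "B"), ("and", "A", ("not", "B")))

    def eval3(f, mu):
        """Kleene three-valued evaluation: True / False / None (unknown)."""
        if isinstance(f, str):
            return mu.get(f)
        if f[0] == "not":
            v = eval3(f[1], mu)
            return None if v is None else not v
        a, b = eval3(f[1], mu), eval3(f[2], mu)
        if f[0] == "and":
            return False if False in (a, b) else (None if None in (a, b) else True)
        return True if True in (a, b) else (None if None in (a, b) else False)

    def entails(f, mu, atoms):
        """Semantic check: does every total extension of mu satisfy f?"""
        free = [x for x in atoms if x not in mu]
        return all(eval3(f, {**mu, **dict(zip(free, bits))}) is True
                   for bits in product([False, True], repeat=len(free)))

    mu = {"A": True}
    print(eval3(PHI, mu))                # None: evaluation cannot tell
    print(entails(PHI, mu, ["A", "B"]))  # True: mu does entail PHI

An enumeration procedure that only uses evaluation-style checks would thus miss partial assignments that a semantic notion of satisfaction would accept, which is exactly the kind of discrepancy the paper investigates.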
Abstract:Goal models have been widely used in Computer Science to represent software requirements, business objectives, and design qualities. Existing goal modelling techniques, however, have shown limitations of expressiveness and/or tractability in coping with complex real-world problems. In this work, we exploit advances in automated reasoning technologies, notably Satisfiability and Optimization Modulo Theories (SMT/OMT), and we propose and formalize: (i) an extended modelling language for goals, namely the Constrained Goal Model (CGM), which makes explicit the notions of goal refinement and of domain assumption, allows for expressing preferences between goals and between refinements, and allows for associating numerical attributes with goals and refinements, so that constraints and multiple objective functions can be defined over goals, refinements, and their numerical attributes; (ii) a novel set of automated reasoning functionalities over CGMs, allowing for automatically generating suitable refinements of input CGMs, under user-specified assumptions and constraints, that also maximize preferences and optimize the given objective functions. We have implemented these modelling and reasoning functionalities in a tool, named CGM-Tool, using the OMT solver OptiMathSAT as automated reasoning backend. Moreover, we have conducted an experimental evaluation on large CGMs to support the claim that our proposal scales well for goal models with thousands of elements.
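As a toy, brute-force rendering of the kind of reasoning that CGM-Tool delegates to OptiMathSAT (all goal names, refinements, and costs below are invented for illustration): a root goal G is realized by refinement R1 = {A, B} or R2 = {C}, and we search for the realization of minimum total cost.

    from itertools import product

    TASKS = ["A", "B", "C"]
    COST = {"A": 5, "B": 3, "C": 10}

    def achieved(sel):
        """Root goal G holds iff refinement R1 = {A, B} or R2 = {C} is taken."""
        r1 = sel["A"] and sel["B"]
        r2 = sel["C"]
        return r1 or r2

    best = None
    for bits in product([False, True], repeat=len(TASKS)):
        sel = dict(zip(TASKS, bits))
        if achieved(sel):                       # hard constraint: realize G
            cost = sum(COST[t] for t in TASKS if sel[t])
            if best is None or cost < best[0]:  # objective: minimize cost
                best = (cost, sel)

    print(best)   # (8, {'A': True, 'B': True, 'C': False})

A real CGM has thousands of elements, preferences, and multiple objectives, which is why the tool relies on an OMT solver rather than on enumeration of this kind.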
Abstract:Modelling problems containing a mixture of Boolean and numerical variables is a long-standing interest of Artificial Intelligence. However, performing inference and learning in hybrid domains is a particularly daunting task. The ability to model this kind of domain is crucial in "learning to design" tasks, that is, learning applications where the goal is to learn from examples how to perform automatic de novo design of novel objects. In this paper we present Structured Learning Modulo Theories, a max-margin approach for learning in hybrid domains based on Satisfiability Modulo Theories, which allows us to combine Boolean reasoning and optimization over continuous linear arithmetical constraints. The main idea is to leverage a state-of-the-art generalized Satisfiability Modulo Theories solver to implement the inference and separation oracles of Structured Output SVMs. We validate our method on artificial and real-world scenarios.
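To make the interplay concrete, here is a toy, runnable Python rendering of the idea: brute force over a four-element output space stands in for the SMT-based inference/separation oracle, and a stochastic subgradient step stands in for the paper's max-margin quadratic program. The feature map, the data, and all names are invented for illustration.

    from itertools import product
    import random

    YS = list(product([0, 1], repeat=2))      # tiny structured output space

    def psi(x, y):                            # joint feature map (toy)
        return [x * y[0], x * y[1], float(y[0] == y[1])]

    def loss(y, yb):                          # Hamming loss
        return sum(a != b for a, b in zip(y, yb))

    def oracle(w, x, y):
        """Separation oracle: most violated output, argmax of loss + score.
        In the paper this argmax is computed by an SMT-based solver."""
        return max(YS, key=lambda yb: loss(y, yb) +
                   sum(wi * fi for wi, fi in zip(w, psi(x, yb))))

    data = [(1.0, (1, 1)), (-1.0, (0, 0))]
    w, eta = [0.0, 0.0, 0.0], 0.1
    for _ in range(100):                      # stochastic subgradient steps
        x, y = random.choice(data)
        yb = oracle(w, x, y)
        if yb != y:                           # violated margin: update w
            g = [a - b for a, b in zip(psi(x, y), psi(x, yb))]
            w = [wi + eta * gi for wi, gi in zip(w, g)]
    print(w)

The structure of the loop (propose the most violated output, then update the weights) is the standard structured-output SVM scheme; what changes in the paper is that the oracle reasons over mixed Boolean-numerical objects.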
Abstract:Generally speaking, the goal of constructive learning can be seen as: given an example set of structured objects, generate novel objects with similar properties. From a statistical-relational learning (SRL) viewpoint, the task can be interpreted as a constraint satisfaction problem, i.e., the generated objects must obey a set of soft constraints whose weights are estimated from the data. Traditional SRL approaches rely on (finite) First-Order Logic (FOL) as a description language, and on MAX-SAT solvers to perform inference. Alas, FOL is unsuited for constructive problems where the objects contain a mixture of Boolean and numerical variables. It is in fact difficult to express, e.g., linear arithmetic constraints within the language of FOL. In this paper we propose a novel class of hybrid SRL methods that rely on Satisfiability Modulo Theories, an alternative class of formal languages that make it possible to describe, and reason over, mixed Boolean-numerical objects and constraints. The resulting methods, which we call Learning Modulo Theories, are formulated within the structured output SVM framework, and employ a weighted SMT solver as an optimization oracle to perform efficient inference and discriminative max-margin weight learning. We also present a few examples of constructive learning applications enabled by our method.
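Schematically (in our notation, not the paper's), inference in Learning Modulo Theories amounts to finding the highest-scoring object that is consistent with the mixed Boolean-numerical background theory \mathcal{T}, with \mathbf{w} the learned weights and \boldsymbol{\psi} the joint feature map:

    \hat{y} \;=\; \operatorname*{argmax}_{y \,:\, (x, y) \,\models\, \mathcal{T}} \; \mathbf{w}^{\top} \boldsymbol{\psi}(x, y)

This is exactly the optimization query that a weighted SMT solver can answer, whereas a MAX-SAT solver cannot once \mathcal{T} contains arithmetic.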
Abstract:The problem of finding small unsatisfiable cores for SAT formulas has recently received a lot of interest, mostly for its applications in formal verification. However, propositional logic is often not expressive enough for representing many interesting verification problems, which can be more naturally addressed in the framework of Satisfiability Modulo Theories, SMT. Surprisingly, the problem of finding unsatisfiable cores in SMT has received very little attention in the literature. In this paper we present a novel approach to this problem, called the Lemma-Lifting approach. The main idea is to combine an SMT solver with an external propositional core extractor. The SMT solver produces the theory lemmas found during the search, dynamically lifting a suitable amount of theory information to the Boolean level. The core extractor is then called on the Boolean abstraction of the original SMT problem and of the theory lemmas. This results in an unsatisfiable core for the original SMT problem, once the remaining theory lemmas are removed. The approach is conceptually interesting, and has several advantages in practice. In fact, it is extremely simple to implement and to update, and it can be interfaced with every propositional core extractor in a plug-and-play manner, so as to benefit for free from all unsat-core reduction techniques which have been or will be made available. We have evaluated our algorithm with a very extensive empirical test on SMT-LIB benchmarks, which confirms the validity and potential of this approach.
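The following self-contained Python toy walks through the loop end-to-end on an invented two-clause example over the LRA atoms (x<1) and (x<2): its Boolean abstraction is propositionally satisfiable until the theory lemma (x<1) -> (x<2) is lifted, and a brute-force deletion-based extractor stands in for the propositional core extractor (in the paper, both the SMT solver and the extractor are industrial-strength tools used as black boxes).

    from itertools import product

    # Literals are (atom, polarity) pairs over atoms "x<1" and "x<2".
    ATOMS = ["x<1", "x<2"]
    ORIG = [[("x<1", True)], [("x<2", False)]]     # abstracted SMT clauses
    LEMMAS = [[("x<1", False), ("x<2", True)]]     # lifted theory lemma

    def sat(clauses):
        """Brute-force propositional satisfiability check."""
        return any(all(any(mu[a] == pol for a, pol in cl) for cl in clauses)
                   for bits in product([False, True], repeat=len(ATOMS))
                   for mu in [dict(zip(ATOMS, bits))])

    def bool_core(clauses):
        """Deletion-based propositional core extraction (brute force)."""
        core = list(clauses)
        for cl in clauses:
            rest = [c for c in core if c is not cl]
            if not sat(rest):
                core = rest
        return core

    core = bool_core(ORIG + LEMMAS)
    smt_core = [c for c in core if c in ORIG]      # drop the lifted lemmas
    print(smt_core)   # an unsat core of the original SMT problem

Dropping the lemmas at the end is sound because they are T-valid: removing them cannot make a T-unsatisfiable subset of the original clauses satisfiable.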
Abstract:In the last two decades, modal and description logics have been applied to numerous areas of computer science, including knowledge representation, formal verification, database theory, distributed computing and, more recently, semantic web and ontologies. For this reason, the problem of automated reasoning in modal and description logics has been thoroughly investigated. In particular, many approaches have been proposed for efficiently handling the satisfiability of the core normal modal logic K(m), and of its notational variant, the description logic ALC. Although simple in structure, K(m)/ALC is computationally very hard to reason about, its satisfiability being PSPACE-complete. In this paper we start exploring the idea of performing automated reasoning tasks in modal and description logics by encoding them into SAT, so that they can be handled by state-of-the-art SAT tools; as with most previous approaches, we begin our investigation with satisfiability in K(m). We propose an efficient encoding, and we test it on an extensive set of benchmarks, comparing the approach with the main state-of-the-art tools available. Although the encoding is necessarily worst-case exponential, our experiments show that, in practice, this approach can handle most or all of the problems that are within the reach of the other approaches, with performance comparable with, or even better than, that of the current state-of-the-art tools.
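As a miniature illustration of the world-labelled propositional encoding (restricted, for brevity, to a single modality and to conjunctions of literals, boxes, and diamonds; the actual encoding handles arbitrary Boolean structure in K(m)): each <>g occurring at a world spawns a fresh successor world where g and every active []h must hold, and satisfiability reduces to checking the resulting propositional constraints over (world, atom) variables. The Python representation below is invented for illustration.

    def encode(fs, w, units, counter):
        """Unfold the formulas fs holding at world w: each <>g spawns a
        fresh successor world where g and every active []h must hold."""
        boxes, dias = [], []
        stack = list(fs)
        while stack:
            f = stack.pop()
            if f[0] == "and":
                stack.extend(f[1:])
            elif f[0] == "lit":                  # (atom name, polarity)
                units.append((w, f[1], f[2]))
            elif f[0] == "box":
                boxes.append(f[1])
            else:                                # "dia"
                dias.append(f[1])
        for d in dias:
            counter[0] += 1
            encode([d] + boxes, w + (counter[0],), units, counter)

    def k_sat(phi):
        units = []
        encode([phi], (0,), units, [0])
        # For this fragment the encoding is a set of unit literals over
        # (world, atom) variables: satisfiable iff no complementary pair.
        uset = set(units)
        return not any((w, a, not p) in uset for (w, a, p) in units)

    # <>p & []q is satisfiable; <>p & []~p is not.
    print(k_sat(("and", ("dia", ("lit", "p", True)),
                        ("box", ("lit", "q", True)))))   # True
    print(k_sat(("and", ("dia", ("lit", "p", True)),
                        ("box", ("lit", "p", False)))))  # False

For arbitrary formulas the unfolding produces clauses rather than unit literals, and it is exactly that clause set which is handed to the SAT solver; the worst-case exponential blow-up mentioned above comes from the nesting of diamonds under disjunctions.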