Abstract: While the uptake of data-driven approaches for materials science and chemistry is at an exciting, early stage, machine learning models must have qualities beyond purely predictive power if they are to realise their true potential for scientific discovery. The predictions and inner workings of models should be explainable, to a certain degree, to human experts, permitting the identification of potential model issues or limitations, building trust in model predictions, and unveiling unexpected correlations that may lead to scientific insights. In this work, we summarize applications of interpretability and explainability techniques for materials science and chemistry and discuss how these techniques can improve the outcome of scientific studies. We discuss various challenges for interpretable machine learning in materials science and, more broadly, in scientific settings. In particular, we emphasize the risks of inferring causation or reaching generalization purely by interpreting machine learning models, and the need for uncertainty estimates for model explanations. Finally, we showcase a number of exciting developments in other fields that could benefit interpretability in materials science and chemistry problems.
Abstract: Many clinical workflows depend on interactive computer systems for highly technical, conceptual work products, such as diagnoses, treatment plans, care coordination, and case management. We describe an automatic logic reasoner to verify objective specifications for these highly technical, but abstract, work products that are essential to care. The conceptual work product specifications serve as a fundamental output requirement, which must be clearly stated, correct, and solvable. Such specifications are strategically important because they, in turn, enable system model checking to verify that machine functions, taken together with user procedures, are actually able to achieve these abstract products. We chose case management of Multiple Sclerosis (MS) outpatients as our use case for its challenging complexity. As a first step, we illustrate how graphical class and state diagrams from UML can be developed and critiqued with subject matter experts to serve as specifications of the conceptual work product of case management. A key feature is that the specification must be declarative and thus independent of any process or technology. Our Work Domain Ontology, together with tools from the Semantic Web, is needed to translate the UML class and state diagrams so that their solvability can be verified with automatic reasoning. The solvable model will then be ready for subsequent use with model checking on the system of human procedures and machine functions. We used the expressive rule language SPARQL Inferencing Notation (SPIN) to develop formal representations of the UML class diagram, the state machine, and their interactions. Using SPIN, we proved the consistency of the interactions of static and dynamic concepts. We discussed how the new SPIN rule engine could be incorporated in the Object Management Group (OMG) Ontology Definition Metamodel (ODM).
Abstract: This paper reviews some of the challenges posed by the huge growth of experimental data generated by the new generation of large-scale experiments at UK national facilities at the Rutherford Appleton Laboratory site at Harwell near Oxford. Such "Big Scientific Data" comes from the Diamond Light Source and Electron Microscopy Facilities, the ISIS Neutron and Muon Facility, and the UK's Central Laser Facility. Increasingly, scientists need to use advanced machine learning and other AI technologies both to automate parts of the data pipeline and to help make new scientific discoveries in the analysis of their data. For commercially important applications, such as object recognition, natural language processing, and automatic translation, deep learning has made dramatic breakthroughs. Google's DeepMind has also used deep learning technology to develop its AlphaFold tool, which makes predictions for protein folding and has achieved some spectacular results for this specific scientific problem. Can deep learning be similarly transformative for other scientific problems? After a brief review of some initial applications of machine learning at the Rutherford Appleton Laboratory, we focus on challenges and opportunities for AI in advancing materials science. Finally, we discuss the importance of developing realistic machine learning benchmarks using Big Scientific Data coming from a number of different scientific domains. We conclude with some initial examples from our "SciML" benchmark suite and of the research challenges these benchmarks will enable.