Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Erin Antono

Machine-learned metrics for predicting the likelihood of success in materials discovery

Nov 27, 2019

Yoolhee Kim, Edward Kim, Erin Antono, Bryce Meredig, Julia Ling

Figure 1 for Machine-learned metrics for predicting the likelihood of success in materials discovery

Figure 2 for Machine-learned metrics for predicting the likelihood of success in materials discovery

Figure 3 for Machine-learned metrics for predicting the likelihood of success in materials discovery

Figure 4 for Machine-learned metrics for predicting the likelihood of success in materials discovery

Abstract:Materials discovery is often compared to the challenge of finding a needle in a haystack. While much work has focused on accurately predicting the properties of candidate materials with machine learning (ML), which amounts to evaluating whether a given candidate is a piece of straw or a needle, less attention has been paid to a critical question: Are we searching in the right haystack? We refer to the haystack as the design space for a particular materials discovery problem (i.e. the set of possible candidate materials to synthesize), and thus frame this question as one of design space selection. In this paper, we introduce two metrics, the Predicted Fraction of Improved Candidates (PFIC), and the Cumulative Maximum Likelihood of Improvement (CMLI), which we demonstrate can identify discovery-rich and discovery-poor design spaces, respectively. Using CMLI and PFIC together to identify optimal design spaces can significantly accelerate ML-driven materials discovery.

* 13 pages, 10 figures, 2 tables

Via

Access Paper or Ask Questions

Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization

Nov 06, 2019

Zachary del Rosario, Yoolhee Kim, Matthias Rupp, Erin Antono, Julia Ling

Figure 1 for Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization

Figure 2 for Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization

Figure 3 for Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization

Figure 4 for Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization

Abstract:Discovering novel materials can be greatly accelerated by iterative machine learning-informed proposal of candidates---active learning. However, standard \emph{global-scope error} metrics for model quality are not predictive of discovery performance, and can be misleading. We introduce the notion of \emph{Pareto shell-scope error} to help judge the suitability of a model for proposing material candidates. Further, through synthetic cases and a thermoelectric dataset, we probe the relation between acquisition function fidelity and active learning performance. Results suggest novel diagnostic tools, as well as new insights for acquisition function design.

* 19 pages, 24 figures

Via

Access Paper or Ask Questions

Overcoming data scarcity with transfer learning

Nov 02, 2017

Maxwell L. Hutchinson, Erin Antono, Brenna M. Gibbons, Sean Paradiso, Julia Ling, Bryce Meredig

Figure 1 for Overcoming data scarcity with transfer learning

Figure 2 for Overcoming data scarcity with transfer learning

Figure 3 for Overcoming data scarcity with transfer learning

Figure 4 for Overcoming data scarcity with transfer learning

Abstract:Despite increasing focus on data publication and discovery in materials science and related fields, the global view of materials data is highly sparse. This sparsity encourages training models on the union of multiple datasets, but simple unions can prove problematic as (ostensibly) equivalent properties may be measured or computed differently depending on the data source. These hidden contextual differences introduce irreducible errors into analyses, fundamentally limiting their accuracy. Transfer learning, where information from one dataset is used to inform a model on another, can be an effective tool for bridging sparse data while preserving the contextual differences in the underlying measurements. Here, we describe and compare three techniques for transfer learning: multi-task, difference, and explicit latent variable architectures. We show that difference architectures are most accurate in the multi-fidelity case of mixed DFT and experimental band gaps, while multi-task most improves classification performance of color with band gaps. For activation energies of steps in NO reduction, the explicit latent variable method is not only the most accurate, but also enjoys cancellation of errors in functions that depend on multiple tasks. These results motivate the publication of high quality materials datasets that encode transferable information, independent of industrial or academic interest in the particular labels, and encourage further development and application of transfer learning methods to materials informatics problems.

Via

Access Paper or Ask Questions

Building Data-driven Models with Microstructural Images: Generalization and Interpretability

Nov 01, 2017

Julia Ling, Maxwell Hutchinson, Erin Antono, Brian DeCost, Elizabeth A. Holm, Bryce Meredig

Figure 1 for Building Data-driven Models with Microstructural Images: Generalization and Interpretability

Figure 2 for Building Data-driven Models with Microstructural Images: Generalization and Interpretability

Figure 3 for Building Data-driven Models with Microstructural Images: Generalization and Interpretability

Figure 4 for Building Data-driven Models with Microstructural Images: Generalization and Interpretability

Abstract:As data-driven methods rise in popularity in materials science applications, a key question is how these machine learning models can be used to understand microstructure. Given the importance of process-structure-property relations throughout materials science, it seems logical that models that can leverage microstructural data would be more capable of predicting property information. While there have been some recent attempts to use convolutional neural networks to understand microstructural images, these early studies have focused only on which featurizations yield the highest machine learning model accuracy for a single data set. This paper explores the use of convolutional neural networks for classifying microstructure with a more holistic set of objectives in mind: generalization between data sets, number of features required, and interpretability.

Via

Access Paper or Ask Questions

High-Dimensional Materials and Process Optimization using Data-driven Experimental Design with Well-Calibrated Uncertainty Estimates

Jul 04, 2017

Julia Ling, Max Hutchinson, Erin Antono, Sean Paradiso, Bryce Meredig

Figure 1 for High-Dimensional Materials and Process Optimization using Data-driven Experimental Design with Well-Calibrated Uncertainty Estimates

Figure 2 for High-Dimensional Materials and Process Optimization using Data-driven Experimental Design with Well-Calibrated Uncertainty Estimates

Figure 3 for High-Dimensional Materials and Process Optimization using Data-driven Experimental Design with Well-Calibrated Uncertainty Estimates

Figure 4 for High-Dimensional Materials and Process Optimization using Data-driven Experimental Design with Well-Calibrated Uncertainty Estimates

Abstract:The optimization of composition and processing to obtain materials that exhibit desirable characteristics has historically relied on a combination of scientist intuition, trial and error, and luck. We propose a methodology that can accelerate this process by fitting data-driven models to experimental data as it is collected to suggest which experiment should be performed next. This methodology can guide the scientist to test the most promising candidates earlier, and can supplement scientific intuition and knowledge with data-driven insights. A key strength of the proposed framework is that it scales to high-dimensional parameter spaces, as are typical in materials discovery applications. Importantly, the data-driven models incorporate uncertainty analysis, so that new experiments are proposed based on a combination of exploring high-uncertainty candidates and exploiting high-performing regions of parameter space. Over four materials science test cases, our methodology led to the optimal candidate being found with three times fewer required measurements than random guessing on average.

* Integrating Materials and Manufacturing Innovation, (2017)

Via

Access Paper or Ask Questions