Abstract:Modern machine learning techniques have been extensively applied to materials science, especially for property prediction tasks. A majority of these methods address scalar property predictions, while more challenging spectral properties remain less emphasized. We formulate a crystal-to-sequence learning task and propose a novel attention-based learning method, Xtal2DoS, which decodes the sequential representation of the material density of states (DoS) properties by incorporating the learned atomic embeddings through attention networks. Experiments show Xtal2DoS is faster than the existing models, and consistently outperforms other state-of-the-art methods on four metrics for two fundamental spectral properties, phonon and electronic DoS.
Abstract:We propose a novel exponentially-modified Gaussian (EMG) mixture residual model. The EMG mixture is well suited to model residuals that are contaminated by a distribution with positive support. This is in contrast to commonly used robust residual models, like the Huber loss or $\ell_1$, which assume a symmetric contaminating distribution and are otherwise asymptotically biased. We propose an expectation-maximization algorithm to optimize an arbitrary model with respect to the EMG mixture. We apply the approach to linear regression and probabilistic matrix factorization (PMF). We compare against other residual models, including quantile regression. Our numerical experiments demonstrate the strengths of the EMG mixture on both tasks. The PMF model arises from considering spectroscopic data. In particular, we demonstrate the effectiveness of PMF in conjunction with the EMG mixture model on synthetic data and two real-world applications: X-ray diffraction and Raman spectroscopy. We show how our approach is effective in inferring background signals and systematic errors in data arising from these experimental settings, dramatically outperforming existing approaches and revealing the data's physically meaningful components.
Abstract:Many real-world tasks involve identifying patterns from data satisfying background and prior knowledge, for which the ground truth is not available, but ideal data can be obtained, for example, using theoretical simulations. We propose a novel approach, imitation refinement, which refines imperfect patterns by imitating ideal patterns. The imperfect patterns are obtained for example using an unsupervised learner. Imitation refinement imitates ideal data by incorporating prior knowledge captured by a classifier trained on the ideal data: an imitation refiner applies small modifications to imperfect patterns so that the classifier can identify them. In a sense, imitation refinement fits the data to the classifier, which complements the classical supervised learning task. We show that our imitation refinement approach outperforms existing methods in identifying crystal patterns from X-ray diffraction data in materials discovery. We also show the generality of our approach by illustrating its applicability to a computer vision task.
Abstract:High-Throughput materials discovery involves the rapid synthesis, measurement, and characterization of many different but structurally-related materials. A key problem in materials discovery, the phase map identification problem, involves the determination of the crystal phase diagram from the materials' composition and structural characterization data. We present Phase-Mapper, a novel AI platform to solve the phase map identification problem that allows humans to interact with both the data and products of AI algorithms, including the incorporation of human feedback to constrain or initialize solutions. Phase-Mapper affords incorporation of any spectral demixing algorithm, including our novel solver, AgileFD, which is based on a convolutive non-negative matrix factorization algorithm. AgileFD can incorporate constraints to capture the physics of the materials as well as human feedback. We compare three solver variants with previously proposed methods in a large-scale experiment involving 20 synthetic systems, demonstrating the efficacy of imposing physical constrains using AgileFD. Phase-Mapper has also been used by materials scientists to solve a wide variety of phase diagrams, including the previously unsolved Nb-Mn-V oxide system, which is provided here as an illustrative example.