Abstract:Powder X-ray diffraction analysis is a critical component of materials characterization methodologies. Discerning characteristic Bragg intensity peaks and assigning them to known crystalline phases is the first qualitative step of evaluating diffraction spectra. Subsequent to phase identification, Rietveld refinement may be employed to extract the abundance of quantitative, material-specific parameters hidden within powder data. These characterization procedures are yet time-consuming and inhibit efficiency in materials science workflows. The ever-increasing popularity and propulsion of data science techniques has provided an obvious solution on the course towards materials analysis automation. Deep learning has become a prime focus for predicting crystallographic parameters and features from X-ray spectra. However, the infeasibility of curating large, well-labelled experimental datasets means that one must resort to a large number of theoretic simulations for powder data augmentation to effectively train deep models. Herein, we are interested in conventional supervised learning algorithms in lieu of deep learning for multi-label crystalline phase identification and quantitative phase analysis for a biomedical application. First, models were trained using very limited experimental data. Further, we incorporated simulated XRD data to assess model generalizability as well as the efficacy of simulation-based training for predictive analysis in a real-world X-ray diffraction application.
Abstract:In powder diffraction data analysis, phase identification is the process of determining the crystalline phases in a sample using its characteristic Bragg peaks. For multiphasic spectra, we must also determine the relative weight fraction of each phase in the sample. Machine Learning algorithms (e.g., Artificial Neural Networks) have been applied to perform such difficult tasks in powder diffraction analysis, but typically require a significant number of training samples for acceptable performance. We have developed an approach that performs well even with a small number of training samples. We apply a fixed-point iteration algorithm on the labelled training samples to estimate monophasic spectra. Then, given an unknown sample spectrum, we again use a fixed-point iteration algorithm to determine the weighted combination of monophase spectra that best approximates the unknown sample spectrum. These weights are the desired phase fractions for the sample. We compare our approach with several traditional Machine Learning algorithms.