Abstract: A new semi-supervised machine learning method for discovering structure-spectrum relationships is developed and demonstrated on the interpretation of X-ray absorption near-edge structure (XANES) spectra. The method constructs a one-to-one mapping between individual structure descriptors and spectral trends. Specifically, an adversarial autoencoder is augmented with a novel rank constraint (RankAAE). The RankAAE methodology produces a continuous and interpretable latent space in which each dimension can track an individual structure descriptor. In doing so, the model provides a robust and quantitative measure of the structure-spectrum relationship by decoupling the intertwined spectral contributions of multiple structural characteristics, which makes it well suited to spectral interpretation and the discovery of new descriptors. The capability of this procedure is showcased using five local structure descriptors and a database of over fifty thousand simulated XANES spectra across eight first-row transition metal oxide families. The resulting structure-spectrum relationships not only reproduce known trends from the literature, but also reveal unintuitive ones that are visually indiscernible in large data sets. The results suggest that the RankAAE methodology has great potential to help researchers interpret complex scientific data, test physical hypotheses, and reveal new patterns that extend scientific insight.
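
The abstract does not state the exact form of the rank constraint, so the following is only an illustrative sketch of how agreement in rank order between one latent dimension and one structure descriptor could be quantified. It assumes a Kendall-style rank correlation; the names `kendall_tau` and `rank_penalty` are hypothetical and are not taken from the RankAAE implementation. The sign-based score below is not differentiable, so a real training loss would presumably need a smooth surrogate.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kendall_tau(latent, descriptor):
    """Kendall tau-a between one latent coordinate and one descriptor.

    Counts concordant minus discordant sample pairs and normalizes by
    the total number of pairs, giving a value in [-1, 1].
    """
    if len(latent) != len(descriptor):
        raise ValueError("inputs must have the same length")
    n = len(latent)
    if n < 2:
        raise ValueError("need at least two samples")
    total = 0
    for i, j in combinations(range(n), 2):
        total += sign(latent[i] - latent[j]) * sign(descriptor[i] - descriptor[j])
    return total / (n * (n - 1) / 2)


def rank_penalty(latent, descriptor):
    """Hypothetical penalty: 0 when the rank order agrees perfectly, 2 when reversed."""
    return 1.0 - kendall_tau(latent, descriptor)


# Toy values (illustrative only, not data from the paper).
latent_dim = [0.1, 0.4, 0.9, 1.3]
descriptor = [2.0, 2.5, 3.1, 4.0]
print(rank_penalty(latent_dim, descriptor))        # perfectly ordered
print(rank_penalty(latent_dim, descriptor[::-1]))  # perfectly reversed
```

A penalty of this kind depends only on the ordering of samples, not on their scale, which is one way a latent dimension could be encouraged to track a descriptor monotonically without forcing a linear relationship.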