Abstract: Polymers are widely studied materials with diverse properties and applications determined by their molecular structures. It is essential to represent these structures clearly and explore the full space of achievable chemical designs. However, existing approaches are unable to offer comprehensive design models for polymers because of their inherent scale and structural complexity. Here, we present a parametric, context-sensitive grammar designed specifically for the representation and generation of polymers. As a demonstrative example, we implement our grammar for polyurethanes. Using our symbolic hypergraph representation and 14 simple production rules, our PolyGrammar is able to represent and generate all valid polyurethane structures. We also present an algorithm to translate any polyurethane structure from the popular SMILES string format into our PolyGrammar representation. We test the representative power of PolyGrammar by translating a dataset of over 600 polyurethane samples collected from the literature. Furthermore, we show that PolyGrammar can be easily extended to other copolymers and homopolymers, such as polyacrylates. By offering a complete, explicit representation scheme and an explainable generative model with validity guarantees, our PolyGrammar takes an important step toward a more comprehensive and practical system for polymer discovery and exploration. As the first bridge between formal languages and chemistry, PolyGrammar also serves as a critical blueprint to inform the design of similar grammars for other chemistries, including organic and inorganic molecules.
Abstract: We present AutoOED, an Optimal Experiment Design platform powered by automated machine learning to accelerate the discovery of optimal solutions. The platform solves multi-objective optimization problems in a time- and data-efficient manner by automatically guiding the design of experiments to be evaluated. To automate the optimization process, we implement several multi-objective Bayesian optimization algorithms with state-of-the-art performance. AutoOED is open-source and written in Python. The codebase is modular, which makes it easy to extend and customize, and it serves as a testbed where machine learning researchers can develop and evaluate their own multi-objective Bayesian optimization algorithms. An intuitive graphical user interface (GUI) is provided to visualize and guide the experiments for users with little or no experience with coding, machine learning, or optimization. Furthermore, a distributed system is integrated to enable parallelized experimental evaluations by independent workers in remote locations. The platform is available at https://autooed.org.