Abstract:Applying AI power to predict syntheses of novel materials requires high-quality, large-scale datasets. Extraction of synthesis information from scientific publications is still challenging, especially for extracting synthesis actions, because of the lack of a comprehensive labeled dataset using a solid, robust, and well-established ontology for describing synthesis procedures. In this work, we propose the first Unified Language of Synthesis Actions (ULSA) for describing ceramics synthesis procedures. We created a dataset of 3,040 synthesis procedures annotated by domain experts according to the proposed ULSA scheme. To demonstrate the capabilities of ULSA, we built a neural network-based model to map arbitrary ceramics synthesis paragraphs into ULSA and used it to construct synthesis flowcharts for synthesis procedures. Analysis for the flowcharts showed that (a) ULSA covers essential vocabulary used by researchers when describing synthesis procedures and (b) it can capture important features of synthesis protocols. This work is an important step towards creating a synthesis ontology and a solid foundation for autonomous robotic synthesis.