Abstract:Predicting the yield percentage of a chemical reaction is useful in many aspects such as reducing wet-lab experimentation by giving the priority to the reactions with a high predicted yield. In this work we investigated the use of multiple type inputs to predict chemical reaction yield. We used simplified molecular-input line-entry system (SMILES) as well as calculated chemical descriptors as model inputs. The model consists of a pre-trained bidirectional transformer-based encoder (BERT) and a multi-layer perceptron (MLP) with a regression head to predict the yield. We experimented on two high throughput experimentation (HTE) datasets for Buchwald-Hartwig and Suzuki-Miyaura reactions. The experiments show improvements in the prediction on both datasets compared to systems using only SMILES or chemical descriptors as input. We also tested the model's performance on out-of-sample dataset splits of Buchwald-Hartwig and achieved comparable results with the state-of-the-art. In addition to predicting the yield, we demonstrated the model's ability to suggest the optimum (highest yield) reaction conditions. The model was able to suggest conditions that achieves 94% of the optimum reported yields. This proves the model to be useful in achieving the best results in the wet lab without expensive experimentation.
Abstract:Glacier mapping is key to ecological monitoring in the hkh region. Climate change poses a risk to individuals whose livelihoods depend on the health of glacier ecosystems. In this work, we present a machine learning based approach to support ecological monitoring, with a focus on glaciers. Our approach is based on semi-automated mapping from satellite images. We utilize readily available remote sensing data to create a model to identify and outline both clean ice and debris-covered glaciers from satellite imagery. We also release data and develop a web tool that allows experts to visualize and correct model predictions, with the ultimate aim of accelerating the glacier mapping process.