Wenzhi Xiao

Supervised Pretraining for Molecular Force Fields and Properties Prediction

Nov 23, 2022

Xiang Gao, Weihao Gao, Wenzhi Xiao, Zhirui Wang, Chong Wang, Liang Xiang

Abstract: Machine learning approaches have become popular for molecular modeling tasks, including molecular force field and property prediction. Traditional supervised learning methods suffer from a scarcity of labeled data for particular tasks, motivating the use of large-scale datasets for other relevant tasks. We propose to pretrain neural networks on a dataset of 86 million molecules with atom charges and 3D geometries as inputs and molecular energies as labels. Experiments show that, compared to training from scratch, fine-tuning the pretrained model significantly improves performance on seven molecular property prediction tasks and two force field tasks. We also demonstrate that the learned representations from the pretrained model contain adequate information about molecular structures, by showing that linear probing of the representations can predict many kinds of molecular information, including atom types, interatomic distances, classes of molecular scaffolds, and the existence of molecular fragments. Our results show that supervised pretraining is a promising research direction in molecular modeling.

* AI4Science Workshop at NeurIPS 2022
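
The linear probing mentioned in the abstract can be illustrated with a minimal sketch: a linear classifier is fit on top of fixed representations, and its accuracy indicates how much of a property is linearly recoverable from them. Everything named here is an assumption for illustration only: `frozen_features` is a hand-written toy array standing in for pretrained representations, `fragment_present` is a made-up binary label, and the probe is plain logistic regression trained by gradient descent, which may differ from the probing setup used in the paper.

```python
import math
from typing import List, Sequence, Tuple


def fit_linear_probe(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Fit a logistic-regression probe with full-batch gradient descent.

    Only the probe weights are trained; the features are treated as fixed.
    """
    n, dim = len(features), len(features[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
            for j in range(dim):
                grad_w[j] += (p - y) * x[j] / n
            grad_b += (p - y) / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def probe_accuracy(
    w: Sequence[float],
    b: float,
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> float:
    """Fraction of examples whose thresholded linear score matches the label."""
    correct = 0
    for x, y in zip(features, labels):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += int((1 if z > 0 else 0) == y)
    return correct / len(labels)


# Hypothetical stand-in for representations from a frozen pretrained model.
frozen_features = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]
# Hypothetical binary property, e.g. whether a given fragment is present.
fragment_present = [0, 0, 1, 1]

w, b = fit_linear_probe(frozen_features, fragment_present)
accuracy = probe_accuracy(w, b, frozen_features, fragment_present)
```

In practice the probe would be evaluated on held-out molecules rather than on its own training set; the toy data above is evaluated in-sample only to keep the sketch short.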

Learning Regularized Positional Encoding for Molecular Prediction

Nov 23, 2022

Xiang Gao, Weihao Gao, Wenzhi Xiao, Zhirui Wang, Chong Wang, Liang Xiang

Abstract: Machine learning has become a promising approach for molecular modeling. Positional quantities, such as interatomic distances and bond angles, play a crucial role in molecular physics. Existing works rely on careful manual design of their representations. To model the complex nonlinearity in predicting molecular properties in a more end-to-end manner, we propose to encode the positional quantities with a learnable embedding that is continuous and differentiable. A regularization technique is employed to encourage embedding smoothness along the physical dimension. We experiment with a variety of molecular property and force field prediction tasks. Improved performance is observed for three different model architectures after plugging in the proposed positional encoding method. In addition, the learned positional encoding allows easier physics-based interpretation. We observe that tasks with similar physics have similar learned positional encodings.

* AI4Science Workshop at NeurIPS 2022
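
One way to picture a learnable positional embedding with a smoothness regularizer is sketched below. This is not the paper's parameterization, which the abstract does not specify. The sketch assumes a table of embedding rows placed on a uniform grid over distances in `[0, d_max]`, looks up a distance by linear interpolation between neighboring rows, and penalizes squared differences between adjacent rows. The names `table`, `d_max`, `interpolate_embedding`, and `smoothness_penalty` are hypothetical. Linear interpolation is continuous but only piecewise differentiable, so a smoother basis would be needed to match the "continuous and differentiable" property stated in the abstract.

```python
from typing import List, Sequence


def interpolate_embedding(
    table: Sequence[Sequence[float]], d: float, d_max: float
) -> List[float]:
    """Embed a distance d by linear interpolation between table rows.

    Rows of `table` (at least two) sit on a uniform grid over [0, d_max];
    distances outside that range are clamped to the nearest end.
    """
    n = len(table)
    x = min(max(d / d_max, 0.0), 1.0) * (n - 1)  # fractional row index
    lo = min(int(x), n - 2)                      # left neighbor row
    weight = x - lo                              # weight of the right neighbor
    return [
        (1.0 - weight) * a + weight * b
        for a, b in zip(table[lo], table[lo + 1])
    ]


def smoothness_penalty(table: Sequence[Sequence[float]]) -> float:
    """Sum of squared differences between adjacent rows of the table.

    Added to the task loss with some coefficient, this discourages the
    embedding from changing abruptly along the distance axis.
    """
    return sum(
        (b - a) ** 2
        for row, next_row in zip(table, table[1:])
        for a, b in zip(row, next_row)
    )
```

In a training loop the table entries would be ordinary learnable parameters, and the penalty would be weighted by a coefficient chosen by the user; both of these choices are assumptions of this sketch.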

Deep Retrieval: An End-to-End Learnable Structure Model for Large-Scale Recommendations

Jul 12, 2020

Weihao Gao, Xiangjun Fan, Jiankai Sun, Kai Jia, Wenzhi Xiao, Chong Wang, Xiaobing Liu

Abstract: One of the core problems in large-scale recommendations is to retrieve the top relevant candidates accurately and efficiently, preferably in sub-linear time. Previous approaches are mostly based on a two-step procedure: first learn an inner-product model, and then use maximum inner product search (MIPS) algorithms to search for the top candidates, leading to a potential loss of retrieval accuracy. In this paper, we present Deep Retrieval (DR), an end-to-end learnable structure model for large-scale recommendations. DR encodes all candidates into a discrete latent space. The latent codes for the candidates are model parameters and are learnt together with the other neural network parameters to maximize the same objective function. With the model learnt, a beam search over the latent codes is performed to retrieve the top candidates. Empirically, we show that DR, with sub-linear computational complexity, can achieve almost the same accuracy as the brute-force baseline.

* 13 pages, 4 figures
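
The retrieval step described in the abstract, a beam search over latent codes, can be sketched generically as follows. This is an illustration under stated assumptions, not the DR implementation: a latent code is taken to be a fixed-length tuple of integers, and `next_log_probs` is a hypothetical callable standing in for the learnt model, returning log-probabilities for the next code given the prefix chosen so far. The toy model at the end is invented for the example.

```python
import math
from typing import Callable, List, Sequence, Tuple

Path = Tuple[int, ...]


def beam_search(
    next_log_probs: Callable[[Path], Sequence[float]],
    depth: int,
    beam_size: int,
) -> List[Tuple[Path, float]]:
    """Return up to `beam_size` highest-scoring code paths of length `depth`.

    At each layer every kept prefix is extended by every possible next code,
    and only the `beam_size` best extended prefixes are kept.
    """
    beams: List[Tuple[Path, float]] = [((), 0.0)]
    for _ in range(depth):
        expanded = [
            (path + (code,), score + log_prob)
            for path, score in beams
            for code, log_prob in enumerate(next_log_probs(path))
        ]
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_size]
    return beams


def toy_model(path: Path) -> List[float]:
    """Hypothetical model: the same distribution over 3 codes at every layer."""
    return [math.log(p) for p in (0.7, 0.2, 0.1)]


top_paths = beam_search(toy_model, depth=2, beam_size=2)
```

With `K` codes per layer, the search scores at most `depth * beam_size * K` extensions, a count that does not depend on the number of candidates. In a full system each returned path would then be mapped to the candidates assigned to that code; that mapping is omitted here.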
