Katelyn Zhou

Representing Molecules as Random Walks Over Interpretable Grammars

Mar 13, 2024

Michael Sun, Minghao Guo, Weize Yuan, Veronika Thost, Crystal Elaine Owens, Aristotle Franklin Grosz, Sharvaa Selvan, Katelyn Zhou, Hassan Mohiuddin, Benjamin J Pedretti (+3 more)

Abstract: Recent research in molecular discovery has primarily been devoted to small, drug-like molecules, leaving many similarly important applications in material design without adequate technology. These applications often rely on more complex molecular structures with fewer examples that are carefully designed using known substructures. We propose a data-efficient and interpretable model for representing and reasoning over such molecules in terms of graph grammars that explicitly describe the hierarchical design space, with motifs as the design basis. We present a novel representation in the form of random walks over the design space, which facilitates both molecule generation and property prediction. We demonstrate clear advantages over existing methods in terms of performance, efficiency, and synthesizability of predicted molecules, and we provide detailed insights into the method's chemical interpretability.

On ML-Based Program Translation: Perils and Promises

Feb 21, 2023

Aniketh Malyala, Katelyn Zhou, Baishakhi Ray, Saikat Chakraborty

Abstract: With the advent of new and advanced programming languages, it becomes imperative to migrate legacy software to new programming languages. Unsupervised Machine Learning-based Program Translation could play an essential role in such migration, even without a sufficiently large, reliable corpus of parallel source code. However, these translators are far from perfect due to their statistical nature. This work investigates unsupervised program translators and where and why they fail. Through in-depth error analysis, we have identified that the cases where such translators fail follow a few particular patterns. With this insight, we develop a rule-based program mutation engine, which pre-processes the input code if the input follows specific patterns and post-processes the output if the output follows certain patterns. We show that our code processing tool, in conjunction with the program translator, can form a hybrid program translator and significantly improve on the state of the art. In the future, we envision an end-to-end program translation tool where programming domain knowledge can be embedded into an ML-based translation pipeline using pre- and post-processing steps.

* 5 pages, 2 figures. Accepted at ICSE 2023 NIER - New Ideas and Emerging Results
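
The hybrid pipeline described in the abstract (rule-based pre-processing, an ML translator, then rule-based post-processing) might be sketched roughly as follows. This is a minimal illustration and not the authors' engine: the names `PRE_RULES`, `POST_RULES`, `apply_rules`, `hybrid_translate`, and `toy_translator` are hypothetical, the regex rules are invented, and the Java-to-Python direction is an assumption made only for the example.

```python
import re
from typing import Callable, List, Tuple

# A rule is a (regex pattern, replacement) pair applied with re.sub.
Rule = Tuple[str, str]

# Hypothetical rules, invented for illustration only.
PRE_RULES: List[Rule] = [
    # Rewrite a statement like `i++;` into `i += 1;` before translation.
    (r"\b(\w+)\+\+;", r"\1 += 1;"),
]
POST_RULES: List[Rule] = [
    # Drop a semicolon left at the end of a translated line.
    (r";\s*$", ""),
]


def apply_rules(code: str, rules: List[Rule]) -> str:
    """Apply each rewrite rule in order; rules that do not match are no-ops."""
    for pattern, replacement in rules:
        code = re.sub(pattern, replacement, code, flags=re.MULTILINE)
    return code


def hybrid_translate(code: str, translator: Callable[[str], str]) -> str:
    """Pre-process the input, run the translator, then post-process the output."""
    prepared = apply_rules(code, PRE_RULES)
    translated = translator(prepared)
    return apply_rules(translated, POST_RULES)


def toy_translator(code: str) -> str:
    """Stand-in for the ML model: returns its input unchanged."""
    return code


result = hybrid_translate("count++;", toy_translator)
```

Because the translator is passed in as a plain callable, the rule sets can be developed and tested independently of whichever model is used.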
