Vineet Thumuluri

Technical report: Improving the properties of molecules generated by LIMO

Jul 20, 2024

Vineet Thumuluri, Peter Eckmann, Michael K. Gilson, Rose Yu

Abstract: This technical report investigates variants of the Latent Inceptionism on Molecules (LIMO) framework to improve the properties of generated molecules. We conduct ablation studies of the molecular representation, decoder model, and surrogate model training scheme. The experiments suggest that an autoregressive Transformer decoder with GroupSELFIES achieves the best average properties for the random generation task.

* 9 pages, 2 figures

Chain-of-Thought Predictive Control

Apr 03, 2023

Zhiwei Jia, Fangchen Liu, Vineet Thumuluri, Linghao Chen, Zhiao Huang, Hao Su

Abstract: We study generalizable policy learning from demonstrations for complex low-level control tasks (e.g., contact-rich object manipulation). We propose an imitation learning method that incorporates the idea of temporal abstraction and the planning capabilities of Hierarchical RL (HRL) in a novel and effective manner. As a step towards decision foundation models, our design can utilize scalable, albeit highly sub-optimal, demonstrations. Specifically, we find that certain short subsequences of the demos, i.e., the chain-of-thought (CoT), reflect their hierarchical structures by marking the completion of subgoals in the tasks. Our model learns to dynamically predict the entire CoT as coherent and structured long-term action guidance and consistently outperforms typical two-stage subgoal-conditioned policies. In addition, such CoT facilitates generalizable policy learning because it exemplifies the decision patterns shared among demos (even those with heavy noise and randomness). Our method, Chain-of-Thought Predictive Control (CoTPC), significantly outperforms existing ones on challenging low-level manipulation tasks from scalable yet highly sub-optimal demos.

* Project page at https://zjia.eng.ucsd.edu/cotpc
