Dominik G. Grimm

GraphXForm: Graph transformer for computer-aided molecular design with application to extraction

Nov 03, 2024

Jonathan Pirnay, Jan G. Rittig, Alexander B. Wolf, Martin Grohe, Jakob Burger, Alexander Mitsos, Dominik G. Grimm

Abstract:Generative deep learning has become pivotal in molecular design for drug discovery and materials science. A widely used paradigm is to pretrain neural networks on string representations of molecules and fine-tune them using reinforcement learning on specific objectives. However, string-based models face challenges in ensuring chemical validity and enforcing structural constraints like the presence of specific substructures. We propose to instead combine graph-based molecular representations, which can naturally ensure chemical validity, with transformer architectures, which are highly expressive and capable of modeling long-range dependencies between atoms. Our approach iteratively modifies a molecular graph by adding atoms and bonds, which ensures chemical validity and facilitates the incorporation of structural constraints. We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned using a new training algorithm that combines elements of the deep cross-entropy method with self-improvement learning from language modeling, allowing stable fine-tuning of deep transformers with many layers. We evaluate GraphXForm on two solvent design tasks for liquid-liquid extraction, showing that it outperforms four state-of-the-art molecular design techniques, while it can flexibly enforce structural constraints or initiate the design from existing molecular structures.
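
The idea of building a molecule by adding atoms and bonds while never leaving the space of valid structures can be illustrated with a small sketch. This is not GraphXForm's implementation: the `MolGraph` class, the `MAX_VALENCE` table, and the method names are hypothetical, and the valence rule is deliberately simplified (no charges, no aromaticity).

```python
# Toy illustration only: a molecular graph that refuses valence-violating edits.
# MAX_VALENCE is an assumed, simplified valence table (neutral atoms only).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}


class MolGraph:
    def __init__(self):
        self.atoms = []   # element symbol for each atom index
        self.bonds = {}   # (i, j) with i < j  ->  bond order

    def used_valence(self, i):
        """Sum of bond orders currently attached to atom i."""
        return sum(order for (a, b), order in self.bonds.items() if i in (a, b))

    def add_atom(self, element):
        """Append an atom and return its index."""
        if element not in MAX_VALENCE:
            raise ValueError(f"unsupported element: {element}")
        self.atoms.append(element)
        return len(self.atoms) - 1

    def add_bond(self, i, j, order=1):
        """Raise the bond order between i and j by `order` if both atoms
        have enough free valence. Returns True on success, False otherwise."""
        if i == j:
            return False
        for k in (i, j):
            if self.used_valence(k) + order > MAX_VALENCE[self.atoms[k]]:
                return False  # the edit would break the valence rule
        key = (min(i, j), max(i, j))
        self.bonds[key] = self.bonds.get(key, 0) + order
        return True


# Build O=C=O, then try to attach one more atom to the saturated carbon.
g = MolGraph()
c, o1, o2 = g.add_atom("C"), g.add_atom("O"), g.add_atom("O")
g.add_bond(c, o1, 2)
g.add_bond(c, o2, 2)
n = g.add_atom("N")
rejected = not g.add_bond(c, n, 1)
```

Because every edit is checked before it is applied, any graph reachable through `add_atom` and `add_bond` satisfies the assumed valence table, which is the property the abstract attributes to graph-based construction.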

Take a Step and Reconsider: Sequence Decoding for Self-Improved Neural Combinatorial Optimization

Jul 24, 2024

Jonathan Pirnay, Dominik G. Grimm

Abstract:The constructive approach within Neural Combinatorial Optimization (NCO) treats a combinatorial optimization problem as a finite Markov decision process, where solutions are built incrementally through a sequence of decisions guided by a neural policy network. To train the policy, recent research is shifting toward a 'self-improved' learning methodology that addresses the limitations of reinforcement learning and supervised approaches. Here, the policy is iteratively trained in a supervised manner, with solutions derived from the current policy serving as pseudo-labels. The way these solutions are obtained from the policy determines the quality of the pseudo-labels. In this paper, we present a simple and problem-independent sequence decoding method for self-improved learning based on sampling sequences without replacement. We incrementally follow the best solution found and repeat the sampling process from intermediate partial solutions. By modifying the policy to ignore previously sampled sequences, we force it to consider only unseen alternatives, thereby increasing solution diversity. Experimental results for the Traveling Salesman and Capacitated Vehicle Routing Problem demonstrate its strong performance. Furthermore, our method outperforms previous NCO approaches on the Job Shop Scheduling Problem.

* Accepted at ECAI-2024
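
Sampling without replacement can be sketched with the Gumbel-top-k trick, a known technique for drawing k distinct items from a categorical distribution. The sketch below works on a flat list of candidates rather than on full decoding sequences, so it is a simplified stand-in and not the decoding method of the paper; the function name and parameters are hypothetical.

```python
import math
import random


def sample_without_replacement(log_probs, k, rng):
    """Gumbel-top-k: add independent Gumbel noise to each log-probability
    and keep the indices of the k largest perturbed values."""
    perturbed = []
    for idx, lp in enumerate(log_probs):
        u = max(rng.random(), 1e-12)        # avoid log(0)
        gumbel = -math.log(-math.log(u))    # standard Gumbel(0, 1) sample
        perturbed.append((lp + gumbel, idx))
    perturbed.sort(reverse=True)
    return [idx for _, idx in perturbed[:k]]


rng = random.Random(0)
probs = (0.5, 0.3, 0.15, 0.05)
picked = sample_without_replacement([math.log(p) for p in probs], 3, rng)
```

Each index appears at most once in `picked`, so repeated draws cannot return the same candidate twice, which is the source of the added diversity.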

Self-Improvement for Neural Combinatorial Optimization: Sample without Replacement, but Improvement

Mar 22, 2024

Jonathan Pirnay, Dominik G. Grimm

Abstract:Current methods for end-to-end constructive neural combinatorial optimization usually train a policy using behavior cloning from expert solutions or policy gradient methods from reinforcement learning. While behavior cloning is straightforward, it requires expensive expert solutions, and policy gradient methods are often computationally demanding and complex to fine-tune. In this work, we bridge the two and simplify the training process by sampling multiple solutions for random instances using the current model in each epoch and then selecting the best solution as an expert trajectory for supervised imitation learning. To achieve progressively improving solutions with minimal sampling, we introduce a method that combines round-wise Stochastic Beam Search with an update strategy derived from a provable policy improvement. This strategy refines the policy between rounds by utilizing the advantage of the sampled sequences with almost no computational overhead. We evaluate our approach on the Traveling Salesman Problem and the Capacitated Vehicle Routing Problem. The models trained with our method achieve comparable performance and generalization to those trained with expert data. Additionally, we apply our method to the Job Shop Scheduling Problem using a transformer-based architecture and outperform existing state-of-the-art methods by a wide margin.
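
The "sample several solutions, keep the best as a pseudo-label" step can be sketched on a toy Traveling Salesman instance. This is a minimal illustration under stated assumptions: `sample_tour` is a uniform random stand-in for a neural policy, and the Stochastic Beam Search and policy update described in the abstract are not reproduced. All function names are hypothetical.

```python
import math
import random


def tour_length(points, tour):
    """Length of the closed tour visiting `points` in the order given by `tour`."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n))


def sample_tour(n, rng):
    """Stand-in for sampling from a policy: a uniformly random permutation."""
    tour = list(range(n))
    rng.shuffle(tour)
    return tour


def best_of_samples(points, num_samples, rng):
    """Sample several tours and return the shortest one, which would then
    serve as the target trajectory for supervised imitation learning."""
    candidates = [sample_tour(len(points), rng) for _ in range(num_samples)]
    return min(candidates, key=lambda t: tour_length(points, t))


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(8)]
best = best_of_samples(points, num_samples=32, rng=rng)
```

In a full training loop, the pair (instance, `best`) would be added to a dataset and the policy trained to imitate it before the next round of sampling.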

Deep reinforcement learning uncovers processes for separating azeotropic mixtures without prior knowledge

Oct 10, 2023

Quirin Göttl, Jonathan Pirnay, Jakob Burger, Dominik G. Grimm

Abstract:Process synthesis in chemical engineering is a complex planning problem due to vast search spaces, continuous parameters and the need for generalization. Deep reinforcement learning agents, trained without prior knowledge, have been shown to outperform humans in various complex planning problems in recent years. Existing work on reinforcement learning for flowsheet synthesis shows promising concepts, but focuses on narrow problems in a single chemical system, limiting its practicality. We present a general deep reinforcement learning approach for flowsheet synthesis. We demonstrate the adaptability of a single agent to the general task of separating binary azeotropic mixtures. Without prior knowledge, it learns to craft near-optimal flowsheets for multiple chemical systems, considering different feed compositions and conceptual approaches. On average, the agent can separate more than 99% of the involved materials into pure components, while autonomously learning fundamental process engineering paradigms. This highlights the agent's planning flexibility, an encouraging step toward true generality.

* 36 pages, 7 figures, 4 tables. Göttl and Pirnay contributed equally as joint first authors. Burger and Grimm contributed equally as joint last authors

EVARS-GPR: EVent-triggered Augmented Refitting of Gaussian Process Regression for Seasonal Data

Jul 06, 2021

Florian Haselbeck, Dominik G. Grimm

Abstract:Time series forecasting is a growing domain with diverse applications. However, changes of the system behavior over time due to internal or external influences are challenging. Therefore, predictions of a previously learned forecasting model might not be useful anymore. In this paper, we present EVent-triggered Augmented Refitting of Gaussian Process Regression for Seasonal Data (EVARS-GPR), a novel online algorithm that is able to handle sudden shifts in the target variable scale of seasonal data. For this purpose, EVARS-GPR combines online change point detection with a refitting of the prediction model using data augmentation for samples prior to a change point. Our experiments on simulated data show that EVARS-GPR is applicable for a wide range of output scale changes. EVARS-GPR has on average a 20.8% lower RMSE on different real-world datasets compared to methods with a similar computational resource consumption. Furthermore, we show that our algorithm leads to a six-fold reduction of the averaged runtime in relation to all comparison partners with a periodical refitting strategy. In summary, we present a computationally efficient online forecasting algorithm for seasonal time series with changes of the target variable scale and demonstrate its functionality on simulated as well as real-world data. All code is publicly available on GitHub: https://github.com/grimmlab/evars-gpr.
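
The combination of change detection and augmentation of earlier samples can be sketched roughly as follows. This is not the EVARS-GPR code from the linked repository: the mean-ratio test is a deliberately crude stand-in for a real online change point detector, the Gaussian process refit is omitted, and the function names and the `threshold` parameter are hypothetical. The sketch also assumes strictly positive target values.

```python
from statistics import mean


def detect_scale_shift(history, recent, threshold=1.5):
    """Crude stand-in for online change point detection: return the ratio of
    the recent mean to the historical mean if it leaves the band
    [1/threshold, threshold], otherwise None. Assumes positive values."""
    ratio = mean(recent) / mean(history)
    if ratio > threshold or ratio < 1.0 / threshold:
        return ratio
    return None


def augment_history(history, ratio):
    """Rescale samples from before the change point to the new output scale,
    so a model refitted on them matches the current level."""
    return [y * ratio for y in history]


history = [10.0, 12.0, 11.0, 9.0]
recent = [20.0, 24.0, 22.0]

ratio = detect_scale_shift(history, recent)
augmented = augment_history(history, ratio) if ratio is not None else history
no_shift = detect_scale_shift(history, [10.0, 11.0])
```

Refitting only when a shift is detected, instead of on a fixed schedule, is what the abstract credits for the runtime reduction relative to periodical refitting.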

Automated Synthesis of Steady-State Continuous Processes using Reinforcement Learning

Jan 12, 2021

Quirin Göttl, Dominik G. Grimm, Jakob Burger

Abstract:Automated flowsheet synthesis is an important field in computer-aided process engineering. The present work demonstrates how reinforcement learning (RL) can be used for automated flowsheet synthesis without any heuristics or prior knowledge of conceptual design. The environment consists of a steady-state flowsheet simulator that contains all physical knowledge. An agent is trained to take discrete actions and sequentially build up flowsheets that solve a given process problem. A novel RL method named SynGameZero is developed to ensure good exploration schemes in the complex problem. Therein, flowsheet synthesis is modelled as a game of two competing players. The RL agent plays this game against itself during training and consists of an artificial neural network and a tree search for forward planning. The method is applied successfully to a reaction-distillation process in a quaternary system.
