Sungbin Lim

Department of Statistics, Korea University, LG AI Research

Probability-Flow ODE in Infinite-Dimensional Function Spaces

Mar 13, 2025

Kunwoo Na, Junghyun Lee, Se-Young Yun, Sungbin Lim

Abstract: Recent advances in infinite-dimensional diffusion models have demonstrated their effectiveness and scalability in function generation tasks where the underlying structure is inherently infinite-dimensional. To accelerate inference in such models, we derive, for the first time, an analog of the probability-flow ODE (PF-ODE) in infinite-dimensional function spaces. Leveraging this newly formulated PF-ODE, we reduce the number of function evaluations while maintaining sample quality in function generation tasks, including applications to PDEs.

* 26 pages, 8 figures. Accepted to the ICLR 2025 DeLTa Workshop

Mol-LLM: Generalist Molecular LLM with Improved Graph Utilization

Feb 05, 2025

Chanhui Lee, Yuheon Song, YongJun Jeong, Hanbum Ko, Rodrigo Hormazabal, Sehui Han, Kyunghoon Bae, Sungbin Lim, Sungwoong Kim

Abstract: Recent advances in Large Language Models (LLMs) have motivated the development of general LLMs for molecular tasks. While several studies have demonstrated that fine-tuned LLMs can achieve impressive benchmark performance, they are far from genuine generalist molecular LLMs due to a lack of fundamental understanding of molecular structure. Specifically, when given molecular task instructions, LLMs trained with naive next-token prediction assign similar likelihood scores to both original and negatively corrupted molecules, revealing a lack of the molecular structure understanding that is crucial for reliable and general molecular LLMs. To overcome this limitation and obtain a true generalist molecular LLM, we introduce a novel multi-modal training method based on thorough multi-modal instruction tuning as well as molecular structure preference optimization between chosen and rejected graphs. On various molecular benchmarks, the proposed generalist molecular LLM, called Mol-LLM, achieves state-of-the-art performance among generalist LLMs on most tasks, while surpassing or remaining comparable to state-of-the-art specialist LLMs. Moreover, Mol-LLM also shows superior generalization performance in reaction prediction tasks, demonstrating the benefit of molecular structure understanding for generalization.

Scalable Multi-Task Transfer Learning for Molecular Property Prediction

Oct 01, 2024

Chanhui Lee, Dae-Woong Jeong, Sung Moon Ko, Sumin Lee, Hyunseung Kim, Soorin Yim, Sehui Han, Sungwoong Kim, Sungbin Lim

Abstract: Molecules have a number of distinct properties whose importance and application vary. In practice, labels for some properties are often hard to obtain despite their practical importance. A common solution to such data scarcity is to use models that generalize well, together with transfer learning. This requires domain experts to design source and target tasks whose features are shared. However, this approach has limitations: (i) difficulty in accurately designing source-target task pairs due to the large number of tasks, (ii) the corresponding computational burden of verifying many trial-and-error transfer learning designs, and thereby (iii) constrained potential for foundation modeling of multi-task molecular property prediction. We address the limitations of the manual design of transfer learning via data-driven bi-level optimization. The proposed method enables scalable multi-task transfer learning for molecular property prediction by automatically obtaining the optimal transfer ratios. Empirically, the proposed method improved the prediction performance of 40 molecular properties and accelerated training convergence.

* ICML2024-AI4Science Poster

Stochastic Optimal Control for Diffusion Bridges in Function Spaces

Jun 03, 2024

Byoungwoo Park, Jungwon Choi, Sungbin Lim, Juho Lee

Abstract: Recent advancements in diffusion models and diffusion bridges primarily focus on finite-dimensional spaces, yet many real-world problems necessitate operations in infinite-dimensional function spaces for more natural and interpretable formulations. In this paper, we present a theory of stochastic optimal control (SOC) tailored to infinite-dimensional spaces, aiming to extend diffusion-based algorithms to function spaces. Specifically, we demonstrate how Doob's $h$-transform, the fundamental tool for constructing diffusion bridges, can be derived from the SOC perspective and expanded to infinite dimensions. This expansion presents a challenge, as infinite-dimensional spaces typically lack closed-form densities. Leveraging our theory, we establish that solving the optimal control problem with a specific objective function choice is equivalent to learning diffusion-based generative models. We propose two applications: (1) learning bridges between two infinite-dimensional distributions and (2) generative models for sampling from an infinite-dimensional distribution. Our approach proves effective for diverse problems involving continuous function space representations, such as resolution-free images, time-series data, and probability density functions.

Can We Utilize Pre-trained Language Models within Causal Discovery Algorithms?

Nov 19, 2023

Chanhui Lee, Juhyeon Kim, Yongjun Jeong, Juhyun Lyu, Junghee Kim, Sangmin Lee, Sangjun Han, Hyeokjun Choe, Soyeon Park, Woohyung Lim(+2 more)

Abstract: Scaling laws have brought Pre-trained Language Models (PLMs) into the field of causal reasoning. Causal reasoning with PLMs relies solely on text-based descriptions, in contrast to causal discovery, which aims to determine the causal relationships between variables from data. Recent research has proposed a method that mimics causal discovery by aggregating the outcomes of repeated causal reasoning, achieved through specifically designed prompts. It highlights the usefulness of PLMs in discovering cause and effect, which is often limited by a lack of data, especially when dealing with multiple variables. However, PLMs do not analyze data and are highly dependent on prompt design, which is a crucial limitation for using PLMs directly in causal discovery. Accordingly, PLM-based causal reasoning depends heavily on the prompt design and carries the risk of overconfidence and false predictions in determining causal relationships. In this paper, we empirically demonstrate the aforementioned limitations of PLM-based causal reasoning through experiments on physics-inspired synthetic data. Then, we propose a new framework that integrates prior knowledge obtained from a PLM with a causal discovery algorithm. This is accomplished by initializing an adjacency matrix for causal discovery and incorporating regularization using prior knowledge. Our proposed framework not only demonstrates improved performance through the integration of PLM and causal discovery but also suggests how to leverage PLM-extracted prior knowledge with existing causal discovery algorithms.

Threshold-aware Learning to Generate Feasible Solutions for Mixed Integer Programs

Aug 01, 2023

Taehyun Yoon, Jinwon Choi, Hyokun Yun, Sungbin Lim

Abstract: Finding a high-quality feasible solution to a combinatorial optimization (CO) problem in a limited time is challenging due to its discrete nature. Recently, there has been an increasing number of machine learning (ML) methods for addressing CO problems. Neural diving (ND) is one of the learning-based approaches to generating partial discrete variable assignments in Mixed Integer Programs (MIP), a framework for modeling CO problems. However, a major drawback of ND is a large discrepancy between the ML and MIP objectives, i.e., between variable value classification accuracy and primal bound. Our study finds that a specific range of variable assignment rates (coverage) yields high-quality feasible solutions, and we suggest that optimizing the coverage bridges the gap between the learning and MIP objectives. Consequently, we introduce a post-hoc method and a learning-based approach for optimizing the coverage. A key idea of our approach is to jointly learn to restrict the coverage search space and to predict the coverage in the learned search space. Experimental results demonstrate that learning a deep neural network to estimate the coverage for finding high-quality feasible solutions achieves state-of-the-art performance on the NeurIPS ML4CO datasets. In particular, our method shows outstanding performance on the workload apportionment dataset, achieving an optimality gap of 0.45%, a ten-fold improvement over SCIP within the one-minute time limit.

Bag of Tricks for In-Distribution Calibration of Pretrained Transformers

Feb 13, 2023

Jaeyoung Kim, Dongbin Na, Sungchul Choi, Sungbin Lim

Abstract: While pre-trained language models (PLMs) have become a de facto standard for improving the accuracy of text classification tasks, recent studies find that PLMs often predict over-confidently. Although various calibration methods have been proposed, such as ensemble learning and data augmentation, most of these methods have been verified on computer vision benchmarks rather than on PLM-based text classification tasks. In this paper, we present an empirical study on confidence calibration for PLMs, addressing three categories: confidence penalty losses, data augmentations, and ensemble methods. We find that an ensemble model overfitted to the training set shows sub-par calibration performance, and we also observe that PLMs trained with a confidence penalty loss exhibit a trade-off between calibration and accuracy. Building on these observations, we propose the Calibrated PLM (CALL), a combination of calibration techniques. CALL compensates for the drawbacks that may occur when a calibration method is used individually and boosts both classification and calibration accuracy. Design choices in CALL's training procedures are extensively studied, and we provide a detailed analysis of how calibration techniques affect the calibration performance of PLMs.

A Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drone

Dec 31, 2021

Aigerim Bogyrbayeva, Taehyun Yoon, Hanbum Ko, Sungbin Lim, Hyokun Yun, Changhyun Kwon

Abstract: Reinforcement learning has recently shown promise in learning quality solutions for many combinatorial optimization problems. In particular, attention-based encoder-decoder models show high effectiveness on various routing problems, including the Traveling Salesman Problem (TSP). Unfortunately, they perform poorly on the TSP with Drone (TSP-D), which requires routing a heterogeneous fleet of vehicles (a truck and a drone) in coordination. In TSP-D, the two vehicles move in tandem and may need to wait at a node for the other vehicle to join. A stateless attention-based decoder fails to achieve such coordination between vehicles. We propose an attention encoder-LSTM decoder hybrid model, in which the decoder's hidden state can represent the sequence of actions made. We empirically demonstrate that such a hybrid model improves upon a purely attention-based model in both solution quality and computational efficiency. Our experiments on the min-max Capacitated Vehicle Routing Problem (mmCVRP) also confirm that the hybrid model is more suitable for coordinated routing of multiple vehicles than the attention-based model.

Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards

Oct 24, 2020

Kyungjae Lee, Hongjun Yang, Sungbin Lim, Songhwai Oh

Abstract: In this paper, we consider stochastic multi-armed bandits (MABs) with heavy-tailed rewards, whose $p$-th moment is bounded by a constant $\nu_{p}$ for $1<p\leq2$. First, we propose a novel robust estimator which does not require $\nu_{p}$ as prior information, while other existing robust estimators demand prior knowledge about $\nu_{p}$. We show that the error probability of the proposed estimator decays exponentially fast. Using this estimator, we propose a perturbation-based exploration strategy and develop a generalized regret analysis scheme that provides upper and lower regret bounds by revealing the relationship between the regret and the cumulative distribution function of the perturbation. From the proposed analysis scheme, we obtain gap-dependent and gap-independent upper and lower regret bounds for various perturbations. We also find the optimal hyperparameters for each perturbation, which can achieve the minimax optimal regret bound with respect to total rounds. In simulation, the proposed estimator shows favorable performance compared to existing robust estimators for various $p$ values and, for MAB problems, the proposed perturbation strategy outperforms existing exploration methods.

* 38 pages, 4 figures

Neural Bootstrapper

Oct 02, 2020

Minsuk Shin, Hyungjoo Cho, Sungbin Lim

Abstract: Bootstrapping has been a primary tool for uncertainty quantification, and its theoretical and computational properties have been investigated in the fields of statistics and machine learning. However, because of its repetitive computations, the computational burden of implementing bootstrap procedures for neural networks is painfully heavy, and this seriously hinders the practical use of these procedures for uncertainty estimation in modern deep learning. To overcome this inconvenience, we propose a procedure called Neural Bootstrapper (NeuBoots). We show that NeuBoots stably generates valid bootstrap samples that coincide with the desired target samples with minimal extra computational cost compared to traditional bootstrapping. Consequently, NeuBoots makes it feasible to construct bootstrap confidence intervals for the outputs of neural networks and quantify their predictive uncertainty. We also propose NeuBoots for deep convolutional neural networks and examine its utility in image classification tasks, including calibration, detection of out-of-distribution samples, and active learning. Empirical results demonstrate that NeuBoots is significantly beneficial for the above purposes.

* 16 pages, 16 figures
