Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sanghyun Lee

Fine-Grained Perturbation Guidance via Attention Head Selection

Jun 12, 2025

Donghoon Ahn, Jiwon Kang, Sanghyun Lee, Minjae Kim, Jaewon Min, Wooseok Jang, Saungwu Lee, Sayak Paul, Susung Hong, Seungryong Kim

Abstract:Recent guidance methods in diffusion models steer reverse sampling by perturbing the model to construct an implicit weak model and guide generation away from it. Among these approaches, attention perturbation has demonstrated strong empirical performance in unconditional scenarios where classifier-free guidance is not applicable. However, existing attention perturbation methods lack principled approaches for determining where perturbations should be applied, particularly in Diffusion Transformer (DiT) architectures where quality-relevant computations are distributed across layers. In this paper, we investigate the granularity of attention perturbations, ranging from the layer level down to individual attention heads, and discover that specific heads govern distinct visual concepts such as structure, style, and texture quality. Building on this insight, we propose "HeadHunter", a systematic framework for iteratively selecting attention heads that align with user-centric objectives, enabling fine-grained control over generation quality and visual attributes. In addition, we introduce SoftPAG, which linearly interpolates each selected head's attention map toward an identity matrix, providing a continuous knob to tune perturbation strength and suppress artifacts. Our approach not only mitigates the oversmoothing issues of existing layer-level perturbation but also enables targeted manipulation of specific visual styles through compositional head selection. We validate our method on modern large-scale DiT-based text-to-image models including Stable Diffusion 3 and FLUX.1, demonstrating superior performance in both general quality enhancement and style-specific guidance. Our work provides the first head-level analysis of attention perturbation in diffusion models, uncovering interpretable specialization within attention layers and enabling practical design of effective perturbation strategies.

* Project page: https://cvlab-kaist.github.io/HeadHunter/

Via

Access Paper or Ask Questions

A Noise is Worth Diffusion Guidance

Dec 05, 2024

Donghoon Ahn, Jiwon Kang, Sanghyun Lee, Jaewon Min, Minjae Kim, Wooseok Jang, Hyoungwon Cho, Sayak Paul, SeonHwa Kim, Eunju Cha(+2 more)

Figure 1 for A Noise is Worth Diffusion Guidance

Abstract:Diffusion models excel in generating high-quality images. However, current diffusion models struggle to produce reliable images without guidance methods, such as classifier-free guidance (CFG). Are guidance methods truly necessary? Observing that noise obtained via diffusion inversion can reconstruct high-quality images without guidance, we focus on the initial noise of the denoising pipeline. By mapping Gaussian noise to `guidance-free noise', we uncover that small low-magnitude low-frequency components significantly enhance the denoising process, removing the need for guidance and thus improving both inference throughput and memory. Expanding on this, we propose \ours, a novel method that replaces guidance methods with a single refinement of the initial noise. This refined noise enables high-quality image generation without guidance, within the same diffusion pipeline. Our noise-refining model leverages efficient noise-space learning, achieving rapid convergence and strong performance with just 50K text-image pairs. We validate its effectiveness across diverse metrics and analyze how refined noise can eliminate the need for guidance. See our project page: https://cvlab-kaist.github.io/NoiseRefine/.

* Project page: https://cvlab-kaist.github.io/NoiseRefine/

Via

Access Paper or Ask Questions

Continual Traffic Forecasting via Mixture of Experts

Jun 05, 2024

Sanghyun Lee, Chanyoung Park

Figure 1 for Continual Traffic Forecasting via Mixture of Experts

Figure 2 for Continual Traffic Forecasting via Mixture of Experts

Figure 3 for Continual Traffic Forecasting via Mixture of Experts

Figure 4 for Continual Traffic Forecasting via Mixture of Experts

Abstract:The real-world traffic networks undergo expansion through the installation of new sensors, implying that the traffic patterns continually evolve over time. Incrementally training a model on the newly added sensors would make the model forget the past knowledge, i.e., catastrophic forgetting, while retraining the model on the entire network to capture these changes is highly inefficient. To address these challenges, we propose a novel Traffic Forecasting Mixture of Experts (TFMoE) for traffic forecasting under evolving networks. The main idea is to segment the traffic flow into multiple homogeneous groups, and assign an expert model responsible for a specific group. This allows each expert model to concentrate on learning and adapting to a specific set of patterns, while minimizing interference between the experts during training, thereby preventing the dilution or replacement of prior knowledge, which is a major cause of catastrophic forgetting. Through extensive experiments on a real-world long-term streaming network dataset, PEMSD3-Stream, we demonstrate the effectiveness and efficiency of TFMoE. Our results showcase superior performance and resilience in the face of catastrophic forgetting, underscoring the effectiveness of our approach in dealing with continual learning for traffic flow forecasting in long-term streaming networks.

Via

Access Paper or Ask Questions

Temporal Graph Learning Recurrent Neural Network for Traffic Forecasting

Jun 04, 2024

Sanghyun Lee, Chanyoung Park

Abstract:Accurate traffic flow forecasting is a crucial research topic in transportation management. However, it is a challenging problem due to rapidly changing traffic conditions, high nonlinearity of traffic flow, and complex spatial and temporal correlations of road networks. Most existing studies either try to capture the spatial dependencies between roads using the same semantic graph over different time steps, or assume all sensors on the roads are equally likely to be connected regardless of the distance between them. However, we observe that the spatial dependencies between roads indeed change over time, and two distant roads are not likely to be helpful to each other when predicting the traffic flow, both of which limit the performance of existing studies. In this paper, we propose Temporal Graph Learning Recurrent Neural Network (TGLRN) to address these problems. More precisely, to effectively model the nature of time series, we leverage Recurrent Neural Networks (RNNs) to dynamically construct a graph at each time step, thereby capturing the time-evolving spatial dependencies between roads (i.e., microscopic view). Simultaneously, we provide the Adaptive Structure Information to the model, ensuring that close and consecutive sensors are considered to be more important for predicting the traffic flow (i.e., macroscopic view). Furthermore, to endow TGLRN with robustness, we introduce an edge sampling strategy when constructing the graph at each time step, which eventually leads to further improvements on the model performance. Experimental results on four commonly used real-world benchmark datasets show the effectiveness of TGLRN.

Via

Access Paper or Ask Questions

Safe and Efficient Trajectory Optimization for Autonomous Vehicles using B-spline with Incremental Path Flattening

Nov 13, 2023

Jongseo Choi, Hyuntai Chin, Hyunwoo Park, Daehyeok Kwon, Sanghyun Lee, Doosan Baek

Figure 1 for Safe and Efficient Trajectory Optimization for Autonomous Vehicles using B-spline with Incremental Path Flattening

Figure 2 for Safe and Efficient Trajectory Optimization for Autonomous Vehicles using B-spline with Incremental Path Flattening

Figure 3 for Safe and Efficient Trajectory Optimization for Autonomous Vehicles using B-spline with Incremental Path Flattening

Figure 4 for Safe and Efficient Trajectory Optimization for Autonomous Vehicles using B-spline with Incremental Path Flattening

Abstract:B-spline-based trajectory optimization has been widely used in robot navigation, especially as quadrotor-like vehicles can easily enjoy the advantage of a B-spline curve (e.g. computational efficiency) with its convex hull property for trajectory optimization. Nevertheless, leveraging the B-splined-based optimization algorithm to generate a collision-free trajectory for autonomous vehicles is still challenging because their complex vehicle kinematics make it difficult to use the convex hull property. In this paper, we propose a novel trajectory optimization algorithm for autonomous vehicles that enables the advantage of a B-spline curve into a B-spline-based optimization algorithm by incorporating vehicle kinematics with two methods. An incremental path flattening is a new method that iteratively increases path curvature weight around vehicle collision points to find a collision-free path by reducing swept volume. A new swept volume estimation method can reduce over-approximation of the swept volume and make the vehicle pass through a narrow corridor without losing safety. Furthermore, a clamped B-spline curvature constraint, which can simplify a B-spline curvature constraint, is added with other feasibility constraints (e.g. longitudinal \& lateral velocity and acceleration) for the vehicle kinodynamic constraints. Our experimental results demonstrate that our method outperforms state-of-the-art baselines in various simulated environments and verifies its valid tracking performance with an autonomous vehicle in a real-world scenario.

* 14 pages, 21 figures, 4 tables, 3 algorithms

Via

Access Paper or Ask Questions

On the training and generalization of deep operator networks

Sep 02, 2023

Sanghyun Lee, Yeonjong Shin

Abstract:We present a novel training method for deep operator networks (DeepONets), one of the most popular neural network models for operators. DeepONets are constructed by two sub-networks, namely the branch and trunk networks. Typically, the two sub-networks are trained simultaneously, which amounts to solving a complex optimization problem in a high dimensional space. In addition, the nonconvex and nonlinear nature makes training very challenging. To tackle such a challenge, we propose a two-step training method that trains the trunk network first and then sequentially trains the branch network. The core mechanism is motivated by the divide-and-conquer paradigm and is the decomposition of the entire complex training task into two subtasks with reduced complexity. Therein the Gram-Schmidt orthonormalization process is introduced which significantly improves stability and generalization ability. On the theoretical side, we establish a generalization error estimate in terms of the number of training data, the width of DeepONets, and the number of input and output sensors. Numerical examples are presented to demonstrate the effectiveness of the two-step training method, including Darcy flow in heterogeneous porous media.

Via

Access Paper or Ask Questions