June Suk Choi

Enhancing LLM Agent Safety via Causal Influence Prompting

Jul 01, 2025

Dongyoon Hahm, Woogyeol Jin, June Suk Choi, Sungsoo Ahn, Kimin Lee

Abstract: As autonomous agents powered by large language models (LLMs) continue to demonstrate potential across various assistive tasks, ensuring their safe and reliable behavior is crucial for preventing unintended consequences. In this work, we introduce CIP, a novel technique that leverages causal influence diagrams (CIDs) to identify and mitigate risks arising from agent decision-making. CIDs provide a structured representation of cause-and-effect relationships, enabling agents to anticipate harmful outcomes and make safer decisions. Our approach consists of three key steps: (1) initializing a CID based on task specifications to outline the decision-making process, (2) guiding agent interactions with the environment using the CID, and (3) iteratively refining the CID based on observed behaviors and outcomes. Experimental results demonstrate that our method effectively enhances safety in both code execution and mobile device control tasks.

* Accepted at ACL 2025 Findings, Source code: https://github.com/HahmDY/causal_influence_prompting.git
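
The three steps named in the abstract (initialize, guide, refine) can be pictured with a toy data structure. The sketch below is not the authors' implementation: the class `CausalInfluenceDiagram`, its methods, and the node names are all hypothetical, chosen only to show how a diagram of decision, chance, and utility nodes might be built, serialized into prompt text, and then updated after an observation.

```python
from dataclasses import dataclass, field


@dataclass
class CausalInfluenceDiagram:
    """Toy CID (hypothetical): typed nodes plus directed cause -> effect edges."""
    nodes: dict = field(default_factory=dict)  # name -> "decision" | "chance" | "utility"
    edges: set = field(default_factory=set)    # (cause, effect) pairs

    def add_node(self, name, kind):
        assert kind in {"decision", "chance", "utility"}
        self.nodes[name] = kind

    def add_edge(self, cause, effect):
        assert cause in self.nodes and effect in self.nodes
        self.edges.add((cause, effect))

    def influences(self, source):
        """Return every node reachable from `source` along causal edges."""
        seen, stack = set(), [source]
        while stack:
            current = stack.pop()
            for cause, effect in self.edges:
                if cause == current and effect not in seen:
                    seen.add(effect)
                    stack.append(effect)
        return seen

    def to_prompt(self):
        """Serialize the diagram as text that could be prepended to an agent prompt."""
        lines = [f"{kind}: {name}" for name, kind in sorted(self.nodes.items())]
        lines += [f"{cause} -> {effect}" for cause, effect in sorted(self.edges)]
        return "\n".join(lines)


# Step 1: initialize a CID from a (made-up) task specification.
cid = CausalInfluenceDiagram()
cid.add_node("run_shell_command", "decision")
cid.add_node("files_deleted", "chance")
cid.add_node("task_completed", "utility")
cid.add_node("user_data_loss", "utility")
cid.add_edge("run_shell_command", "files_deleted")
cid.add_edge("run_shell_command", "task_completed")

# Step 2: the serialized diagram would accompany the agent's next prompt.
prompt_context = cid.to_prompt()

# Step 3: refine the CID after an observation links deletion to data loss.
cid.add_edge("files_deleted", "user_data_loss")
reachable_after_refinement = cid.influences("run_shell_command")
```

After the refinement step, `reachable_after_refinement` contains `user_data_loss`, which is the kind of signal an agent could use to reconsider the action. How the actual method constructs and consumes its diagrams is described in the paper and the linked repository.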

Enhancing Motion Dynamics of Image-to-Video Models via Adaptive Low-Pass Guidance

Jun 10, 2025

June Suk Choi, Kyungmin Lee, Sihyun Yu, Yisol Choi, Jinwoo Shin, Kimin Lee

Abstract: Recent text-to-video (T2V) models have demonstrated strong capabilities in producing high-quality, dynamic videos. To improve visual controllability, recent works have considered fine-tuning pre-trained T2V models to support image-to-video (I2V) generation. However, such adaptation frequently suppresses the motion dynamics of generated outputs, resulting in more static videos compared to their T2V counterparts. In this work, we analyze this phenomenon and identify that it stems from premature exposure to high-frequency details in the input image, which biases the sampling process toward a shortcut trajectory that overfits to the static appearance of the reference image. To address this, we propose adaptive low-pass guidance (ALG), a simple fix to the I2V model sampling procedure that generates more dynamic videos without compromising per-frame image quality. Specifically, ALG adaptively modulates the frequency content of the conditioning image by applying low-pass filtering at the early stage of denoising. Extensive experiments demonstrate that ALG significantly improves the temporal dynamics of generated videos while preserving image fidelity and text alignment. In particular, on the VBench-I2V test suite, ALG achieves an average improvement of 36% in dynamic degree without a significant drop in video quality or image fidelity.

* Preprint. Under review. Project page available at http://choi403.github.io/ALG
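
The idea of low-pass filtering the conditioning image only during early denoising steps can be illustrated with a small standard-library sketch. Everything here is an assumption for illustration: the box filter, the function names, and the `cutoff` and `max_radius` parameters are hypothetical stand-ins, and the paper's actual filter and schedule may differ.

```python
def box_blur(image, radius):
    """Low-pass filter a 2D grayscale image (list of rows) with a box kernel."""
    if radius == 0:
        return [row[:] for row in image]
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def blur_radius(step, num_steps, max_radius=2, cutoff=0.3):
    """Hypothetical schedule: strongest filtering at step 0, none after `cutoff`."""
    progress = step / max(1, num_steps - 1)
    if progress >= cutoff:
        return 0
    return max(1, round(max_radius * (1 - progress / cutoff)))


def conditioning_for_step(image, step, num_steps):
    """Return the conditioning image a sampler would see at this denoising step."""
    return box_blur(image, blur_radius(step, num_steps))


# A checkerboard is almost pure high-frequency content.
checkerboard = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
early = conditioning_for_step(checkerboard, step=0, num_steps=10)
late = conditioning_for_step(checkerboard, step=9, num_steps=10)
```

In this toy setup the early-step conditioning has its fine detail smoothed away, while the late-step conditioning is the unmodified image, so per-frame detail is still available once the coarse motion has been decided.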

Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models

Mar 12, 2025

Sangwon Jang, June Suk Choi, Jaehyeong Jo, Kimin Lee, Sung Ju Hwang

Abstract: Text-to-image diffusion models have achieved remarkable success in generating high-quality content from text prompts. However, their reliance on publicly available data and the growing trend of data sharing for fine-tuning make these models particularly vulnerable to data poisoning attacks. In this work, we introduce the Silent Branding Attack, a novel data poisoning method that manipulates text-to-image diffusion models to generate images containing specific brand logos or symbols without any text triggers. We find that when certain visual patterns appear repeatedly in the training data, the model learns to reproduce them naturally in its outputs, even without prompt mentions. Leveraging this, we develop an automated data poisoning algorithm that unobtrusively injects logos into original images, ensuring they blend naturally and remain undetected. Models trained on this poisoned dataset generate images containing logos without degrading image quality or text alignment. We experimentally validate our silent branding attack across two realistic settings, large-scale high-quality image datasets and style personalization datasets, achieving high success rates even without a specific text trigger. Human evaluation and quantitative metrics including logo detection show that our method can stealthily embed logos.

* CVPR 2025. Project page: https://silent-branding.github.io/

DiffExp: Efficient Exploration in Reward Fine-tuning for Text-to-Image Diffusion Models

Feb 19, 2025

Daewon Chae, June Suk Choi, Jinkyu Kim, Kimin Lee

Abstract: Fine-tuning text-to-image diffusion models to maximize rewards has proven effective for enhancing model performance. However, reward fine-tuning methods often suffer from slow convergence due to online sample generation. Therefore, obtaining diverse samples with strong reward signals is crucial for improving sample efficiency and overall performance. In this work, we introduce DiffExp, a simple yet effective exploration strategy for reward fine-tuning of text-to-image models. Our approach employs two key strategies: (a) dynamically adjusting the scale of classifier-free guidance to enhance sample diversity, and (b) randomly weighting phrases of the text prompt to exploit high-quality reward signals. We demonstrate that these strategies significantly enhance exploration during online sample generation, improving the sample efficiency of recent reward fine-tuning methods, such as DDPO and AlignProp.

* AAAI 2025
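
The two strategies in the abstract can be sketched in a few lines. Classifier-free guidance combines an unconditional and a conditional noise prediction as `eps_uncond + scale * (eps_cond - eps_uncond)`; the rest of the example is assumed for illustration. In particular, the linear ramp in `dynamic_scale`, the comma-based phrase splitting, and the weight range are hypothetical choices, not the schedule or weighting scheme reported in the paper.

```python
import random


def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Standard CFG combination of unconditional and conditional predictions."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def dynamic_scale(step, num_steps, low=1.0, high=7.5):
    """Hypothetical schedule: weak guidance early (more diverse), strong guidance late."""
    progress = step / max(1, num_steps - 1)
    return low + (high - low) * progress


def random_phrase_weights(prompt, rng, low=0.5, high=1.5):
    """Split a prompt on commas and attach a random emphasis weight to each phrase."""
    phrases = [part.strip() for part in prompt.split(",") if part.strip()]
    return [(phrase, rng.uniform(low, high)) for phrase in phrases]


rng = random.Random(0)  # seeded so repeated runs give the same weights
weighted = random_phrase_weights("a red fox, in the snow, at sunrise", rng)
guided = classifier_free_guidance([0.0, 1.0], [1.0, 3.0], scale=2.0)
scales = [dynamic_scale(step, num_steps=10) for step in range(10)]
```

During online sample generation, each rollout would draw fresh phrase weights and follow the guidance schedule, so that the reward model sees a wider variety of samples than a fixed scale and fixed prompt would produce.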

MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control

Oct 23, 2024

Juyong Lee, Dongyoon Hahm, June Suk Choi, W. Bradley Knox, Kimin Lee

Abstract: Autonomous agents powered by large language models (LLMs) show promising potential in assistive tasks across various domains, including mobile device control. As these agents interact directly with personal information and device settings, ensuring their safe and reliable behavior is crucial to prevent undesirable outcomes. However, no benchmark exists for standardized evaluation of the safety of mobile device-control agents. In this work, we introduce MobileSafetyBench, a benchmark designed to evaluate the safety of device-control agents within a realistic mobile environment based on Android emulators. We develop a diverse set of tasks involving interactions with various mobile applications, including messaging and banking applications. To clearly evaluate safety apart from general capabilities, we design separate tasks measuring safety and tasks evaluating helpfulness. The safety tasks challenge agents with managing potential risks prevalent in daily life and include tests to evaluate robustness against indirect prompt injections. Our experiments demonstrate that while baseline agents, based on state-of-the-art LLMs, perform well in executing helpful tasks, they show poor performance in safety tasks. To mitigate these safety concerns, we propose a prompting method that encourages agents to prioritize safety considerations. While this method shows promise in promoting safer behaviors, there is still considerable room for improvement to fully earn user trust. This highlights the urgent need for continued research to develop more robust safety mechanisms in mobile environments. We open-source our benchmark at: https://mobilesafetybench.github.io/.

DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

Oct 08, 2024

June Suk Choi, Kyungmin Lee, Jongheon Jeong, Saining Xie, Jinwoo Shin, Kimin Lee

Abstract: Recent advances in diffusion models have introduced a new era of text-guided image manipulation, enabling users to create realistic edited images with simple textual prompts. However, there is significant concern about the potential misuse of these methods, especially in creating misleading or harmful content. Although recent defense strategies, which introduce imperceptible adversarial noise to induce model failure, have shown promise, they remain ineffective against more sophisticated manipulations, such as editing with a mask. In this work, we propose DiffusionGuard, a robust and effective defense method against unauthorized edits by diffusion-based image editing models, even in challenging setups. Through a detailed analysis of these models, we introduce a novel objective that generates adversarial noise targeting the early stage of the diffusion process. This approach significantly improves the efficiency and effectiveness of the adversarial noise. We also introduce a mask-augmentation technique to enhance robustness against various masks at test time. Finally, we introduce a comprehensive benchmark designed to evaluate the effectiveness and robustness of methods in protecting against privacy threats in realistic scenarios. Through extensive experiments, we show that our method achieves stronger protection and improved mask robustness with lower computational costs compared to the strongest baseline. Additionally, our method exhibits superior transferability and better resilience to noise removal techniques compared to all baseline methods. Our source code is publicly available at https://github.com/choi403/DiffusionGuard.

* Preprint. Under review

Collaborative Score Distillation for Consistent Visual Synthesis

Jul 04, 2023

Subin Kim, Kyungmin Lee, June Suk Choi, Jongheon Jeong, Kihyuk Sohn, Jinwoo Shin

Abstract: Generative priors of large-scale text-to-image diffusion models enable a wide range of new generation and editing applications on diverse visual modalities. However, when adapting these priors to complex visual modalities, often represented as multiple images (e.g., video), achieving consistency across a set of images is challenging. In this paper, we address this challenge with a novel method, Collaborative Score Distillation (CSD). CSD is based on the Stein Variational Gradient Descent (SVGD). Specifically, we propose to consider multiple samples as "particles" in the SVGD update and combine their score functions to distill generative priors over a set of images synchronously. Thus, CSD facilitates seamless integration of information across 2D images, leading to a consistent visual synthesis across multiple samples. We show the effectiveness of CSD in a variety of tasks, encompassing the visual editing of panorama images, videos, and 3D scenes. Our results underline the competency of CSD as a versatile method for enhancing inter-sample consistency, thereby broadening the applicability of text-to-image diffusion models.

* Project page with visuals: https://subin-kim-cv.github.io/CSD/
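
For readers unfamiliar with SVGD, the generic particle update that the abstract refers to can be shown on scalars. This is a sketch of plain SVGD with an RBF kernel, not of CSD itself: the target here is a toy standard Gaussian whose score is `-x`, whereas CSD would obtain scores from a text-to-image diffusion model and treat images as particles. The function names, step size, and bandwidth are illustrative assumptions.

```python
import math


def rbf_kernel(a, b, bandwidth):
    """Radial basis function kernel between two scalar particles."""
    return math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2))


def svgd_step(particles, score, step_size=0.1, bandwidth=1.0):
    """One SVGD update: each particle moves using scores from all particles."""
    n = len(particles)
    updated = []
    for x_i in particles:
        direction = 0.0
        for x_j in particles:
            k = rbf_kernel(x_j, x_i, bandwidth)
            # Attraction: kernel-weighted score pulls x_i toward high density.
            attraction = k * score(x_j)
            # Repulsion: gradient of k(x_j, x_i) w.r.t. x_j keeps particles apart.
            repulsion = -k * (x_j - x_i) / bandwidth ** 2
            direction += attraction + repulsion
        updated.append(x_i + step_size * direction / n)
    return updated


def gaussian_score(x):
    """Score (gradient of the log-density) of a standard Gaussian."""
    return -x


particles = [3.0, 3.5, 4.0]
for _ in range(200):
    particles = svgd_step(particles, gaussian_score)
mean_after = sum(particles) / len(particles)
spread_after = max(particles) - min(particles)
```

The attraction term is where information is shared: every particle's update mixes in the scores evaluated at the other particles, which is the mechanism the abstract credits for inter-sample consistency. The repulsion term prevents the particles from collapsing onto a single point.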
