Kyu-Hwan Jung

Doppelgänger Method: Breaking Role Consistency in LLM Agent via Prompt-based Transferable Adversarial Attack

Jun 17, 2025

Daewon Kang, YeongHwan Shin, Doyeon Kim, Kyu-Hwan Jung, Meong Hi Son

Abstract: Since the advent of large language models, prompt engineering has enabled the rapid, low-effort creation of diverse autonomous agents that are already in widespread use. Yet this convenience raises urgent concerns about the safety, robustness, and behavioral consistency of the underlying prompts, along with the pressing challenge of protecting those prompts against users' attempts to expose them. In this paper, we propose the "Doppelgänger method" to demonstrate the risk of an agent being hijacked, thereby exposing system instructions and internal information. Next, we define the "Prompt Alignment Collapse under Adversarial Transfer (PACAT)" level to evaluate vulnerability to this adversarial transfer attack. We also propose a "Caution for Adversarial Transfer (CAT)" prompt to counter the Doppelgänger method. The experimental results demonstrate that the Doppelgänger method can compromise the agent's consistency and expose its internal information. In contrast, CAT prompts enable effective defense against this adversarial attack.

Relieving the Plateau: Active Semi-Supervised Learning for a Better Landscape

Apr 08, 2021

Seo Taek Kong, Soomin Jeon, Jaewon Lee, Hongseok Lee, Kyu-Hwan Jung

Abstract: Deep learning (DL) relies on massive amounts of labeled data, and improving its labeled-sample efficiency has remained one of the most important problems since its advent. Semi-supervised learning (SSL) leverages unlabeled data, which are more accessible than their labeled counterparts. Active learning (AL) selects unlabeled instances to be annotated by a human-in-the-loop in hopes of better performance with less labeled data. Given the accessible pool of unlabeled data in pool-based AL, it seems natural to use SSL when training and AL to update the labeled set; however, algorithms designed for their combination remain limited. In this work, we first prove that convergence of gradient descent on sufficiently wide ReLU networks can be expressed in terms of the eigenspectrum of their Gram matrix. Equipped with a few theoretical insights, we propose convergence rate control (CRC), an AL algorithm that selects unlabeled data to improve the problem conditioning upon inclusion in the labeled set, by formulating an acquisition step in terms of improving training dynamics. Extensive experiments show that SSL algorithms coupled with CRC can achieve high performance using very few labeled data.
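To make the link between a Gram matrix's eigenspectrum and problem conditioning concrete, here is a minimal sketch. It assumes the infinite-width Gram matrix commonly used in analyses of two-layer ReLU networks on unit-norm inputs, H_ij = x_i·x_j (π − θ_ij) / (2π), where θ_ij is the angle between x_i and x_j. This is an illustrative assumption, not necessarily the exact matrix the paper defines, and the function name `relu_gram_matrix` is hypothetical.

```python
import numpy as np

def relu_gram_matrix(X):
    """Infinite-width Gram matrix of a two-layer ReLU network (assumed form).

    Rows of X are inputs; they are normalized to unit length first.
    H[i, j] = <x_i, x_j> * (pi - angle(x_i, x_j)) / (2 * pi)
    """
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    cos = np.clip(X @ X.T, -1.0, 1.0)   # guard arccos against rounding error
    theta = np.arccos(cos)
    return cos * (np.pi - theta) / (2.0 * np.pi)

# Toy data: 5 random inputs in 3 dimensions (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

H = relu_gram_matrix(X)
eigvals = np.linalg.eigvalsh(H)          # ascending order, H is symmetric

# A larger smallest eigenvalue (smaller condition number) indicates
# a better-conditioned problem in this style of analysis.
condition_number = eigvals[-1] / eigvals[0]
print("smallest eigenvalue:", eigvals[0])
print("condition number:", condition_number)
```

In this picture, an acquisition rule in the spirit of the abstract would prefer unlabeled points whose inclusion improves the conditioning of such a matrix; how CRC actually scores candidates is specified in the paper, not here.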

Resource Optimized Neural Architecture Search for 3D Medical Image Segmentation

Sep 02, 2019

Woong Bae, Seungho Lee, Yeha Lee, Beomhee Park, Minki Chung, Kyu-Hwan Jung

Abstract: Neural Architecture Search (NAS), a framework that automates the task of designing neural networks, has recently been actively studied in the field of deep learning. However, only a few NAS methods are suitable for 3D medical image segmentation. Medical 3D images are generally very large; thus it is difficult to apply previous NAS methods due to their GPU computational burden and long training time. We propose a resource-optimized neural architecture search method that can be applied to 3D medical segmentation tasks in a short training time (1.39 days for a 1 GB dataset) using a small amount of computation power (one RTX 2080Ti, 10.8 GB GPU memory). Excellent performance can also be achieved without retraining (fine-tuning), which is essential in most NAS methods. These advantages can be achieved by using a reinforcement learning-based controller with parameter sharing and by focusing on the optimal search space configuration of macro search rather than micro search. Our experiments demonstrate that the proposed NAS method outperforms manually designed networks with state-of-the-art performance in 3D medical image segmentation.

* Accepted at MICCAI (International Conference on Medical Image Computing and Computer Assisted Intervention) 2019

Integrating Reinforcement Learning to Self Training for Pulmonary Nodule Segmentation in Chest X-rays

Nov 21, 2018

Sejin Park, Woochan Hwang, Kyu-Hwan Jung

Abstract: Machine learning applications in medical imaging are frequently limited by the lack of quality labeled data. In this paper, we explore the self-training method, a form of semi-supervised learning, to address the labeling burden. By integrating reinforcement learning, we were able to expand the application of self-training to complex segmentation networks without any further human annotation. The proposed approach, reinforced self-training (ReST), fine-tunes a semantic segmentation network by introducing a policy network that learns to generate pseudolabels. We incorporate an expert demonstration network, based on inverse reinforcement learning, to enhance the clinical validity and convergence of the policy network. The model was tested on a pulmonary nodule segmentation task in chest X-rays and achieved the performance of a standard U-Net while using only 50% of the labeled data, by exploiting unlabeled data. When the same amount of labeled data was used, a moderate to significant improvement in cross-validation accuracy was achieved, depending on the absolute number of labels used.

* Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

Classification of Findings with Localized Lesions in Fundoscopic Images using a Regionally Guided CNN

Nov 02, 2018

Jaemin Son, Woong Bae, Sangkeun Kim, Sang Jun Park, Kyu-Hwan Jung

Abstract: Fundoscopic images are often examined by ophthalmologists to spot abnormal lesions and make diagnoses. Recent successes of convolutional neural networks are confined to diagnoses of a few diseases without proper localization of lesions. In this paper, we propose an efficient annotation method for localizing lesions and a CNN architecture that can classify an individual finding and localize the lesions at the same time. We also introduce a new loss function that uses the regional annotations to guide the network to learn meaningful patterns. In experiments, we demonstrate that our network performs better than the widely used network and that the guidance loss helps achieve up to 4.1% higher AUROC and superior localization capability.

* 8 pages, Computational Pathology and Ophthalmic Medical Image Analysis, pp.176-184

Retinal Vessel Segmentation in Fundoscopic Images with Generative Adversarial Networks

Jun 28, 2017

Jaemin Son, Sang Jun Park, Kyu-Hwan Jung

Abstract: Retinal vessel segmentation is an indispensable step for automatic detection of retinal diseases with fundoscopic images. Though many approaches have been proposed, existing methods tend to miss fine vessels or allow false positives at terminal branches. Beyond under-segmentation, over-segmentation is also problematic when quantitative studies need to measure the precise width of vessels. In this paper, we present a method that generates a precise map of retinal vessels using generative adversarial training. Our method achieves a Dice coefficient of 0.829 on the DRIVE dataset and 0.834 on the STARE dataset, which is state-of-the-art performance on both datasets.

* 9 pages, submitted to DLMIA 2017
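For readers unfamiliar with the metric reported above, the Dice coefficient between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch on flattened binary masks; the function name `dice_coefficient` and the toy masks are hypothetical, and the paper's own evaluation protocol (thresholding, field-of-view masking) may differ.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as vessel in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total vessel pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total

# Toy 6-pixel masks (1 = vessel, 0 = background), purely illustrative.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)
print(f"Dice: {score:.3f}")
```

A score of 1 means the two masks coincide exactly, and 0 means they share no vessel pixels.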
