Hyunseung Chung

PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions

May 23, 2025

Daeun Kyung, Hyunseung Chung, Seongsu Bae, Jiho Kim, Jae Ho Sohn, Taerim Kim, Soo Kyung Kim, Edward Choi

Abstract: Doctor-patient consultations require multi-turn, context-aware communication tailored to diverse patient personas. Training or evaluating doctor LLMs in such settings requires realistic patient interaction systems. However, existing simulators often fail to reflect the full range of personas seen in clinical practice. To address this, we introduce PatientSim, a patient simulator that generates realistic and diverse patient personas for clinical scenarios, grounded in medical expertise. PatientSim operates using: 1) clinical profiles, including symptoms and medical history, derived from real-world data in the MIMIC-ED and MIMIC-IV datasets, and 2) personas defined by four axes: personality, language proficiency, medical history recall level, and cognitive confusion level, resulting in 37 unique combinations. We evaluated eight LLMs for factual accuracy and persona consistency. The top-performing open-source model, Llama 3.3, was validated by four clinicians to confirm the robustness of our framework. As an open-source, customizable platform, PatientSim provides a reproducible and scalable solution that can be customized for specific training needs. Offering a privacy-compliant environment, it serves as a robust testbed for evaluating medical dialogue systems across diverse patient presentations and shows promise as an educational tool for healthcare.

* 9 pages for main text, 4 pages for references, 27 pages for supplementary materials

Time is Not Enough: Time-Frequency based Explanation for Time-Series Black-Box Models

Aug 07, 2024

Hyunseung Chung, Sumin Jo, Yeonsu Kwon, Edward Choi

Abstract: Despite the massive attention given to time-series explanations due to their extensive applications, a notable limitation in existing approaches is their primary reliance on the time domain. This overlooks the inherent characteristic of time-series data containing both time and frequency features. In this work, we present Spectral eXplanation (SpectralX), an XAI framework that provides time-frequency explanations for time-series black-box classifiers. This easily adaptable framework enables users to "plug-in" various perturbation-based XAI methods for any pre-trained time-series classification models to assess their impact on the explanation quality without having to modify the framework architecture. Additionally, we introduce Feature Importance Approximations (FIA), a new perturbation-based XAI method. These methods consist of feature insertion, deletion, and combination techniques to enhance computational efficiency and class-specific explanations in time-series classification tasks. We conduct extensive experiments on a generated synthetic dataset and various UCR Time-Series datasets to first compare the explanation performance of FIA and other existing perturbation-based XAI methods in both the time domain and the time-frequency domain, and then show the superiority of our FIA in the time-frequency domain with the SpectralX framework. Finally, we conduct a user study to confirm the practicality of FIA in the SpectralX framework for class-specific time-frequency based time-series explanations. The source code is available at https://github.com/gustmd0121/Time_is_not_Enough

* Accepted to CIKM 2024 (10 pages, 4 figures, 6 tables)
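
The abstract above refers to perturbing features in the time-frequency domain. As a rough illustration only, and not the SpectralX or FIA implementation, the sketch below shows what deleting a single time-frequency feature can look like. The frame length, the function names, and the use of non-overlapping rectangular frames with a naive DFT are all assumptions chosen to keep the example dependency-free; a practical pipeline would more likely use a windowed STFT from a signal-processing library.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame (list of floats)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def delete_tf_cell(signal, frame_len, frame_idx, freq_bin):
    """Zero one (frame, frequency-bin) cell and return the perturbed signal.

    Assumes len(signal) is a multiple of frame_len and
    0 < freq_bin < frame_len / 2. The mirrored bin is zeroed as well so
    that the reconstructed frame stays real-valued.
    """
    out = []
    for start in range(0, len(signal), frame_len):
        spec = dft(signal[start:start + frame_len])
        if start // frame_len == frame_idx:
            spec[freq_bin] = 0
            spec[frame_len - freq_bin] = 0
        out.extend(idft(spec))
    return out


# Hypothetical toy input: a sine with exactly 2 cycles per 8-sample frame.
signal = [math.sin(2 * math.pi * 2 * t / 8) for t in range(32)]
perturbed = delete_tf_cell(signal, frame_len=8, frame_idx=1, freq_bin=2)
```

In a perturbation-based explanation, a classifier's output on `signal` and on `perturbed` would then be compared; a large change suggests the deleted cell mattered for the prediction.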

DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents

Jun 19, 2024

Jiho Kim, Woosog Chay, Hyeonji Hwang, Daeun Kyung, Hyunseung Chung, Eunbyeol Cho, Yohan Jo, Edward Choi

Abstract: Recent advancements in Large Language Models (LLMs) have significantly enhanced the capabilities of conversational agents, making them applicable to various fields (e.g., education). Despite their progress, the evaluation of the agents often overlooks the complexities of real-world conversations, such as real-time interactions, multi-party dialogues, and extended contextual dependencies. To bridge this gap, we introduce DialSim, a real-time dialogue simulator. In this simulator, an agent is assigned the role of a character from popular TV shows, requiring it to respond to spontaneous questions using past dialogue information and to distinguish between known and unknown information. Key features of DialSim include evaluating the agent's ability to respond within a reasonable time limit, handling long-term multi-party dialogues, and managing adversarial settings (e.g., swap character names) to challenge the agent's reliance on pre-trained knowledge. We utilized this simulator to evaluate the latest conversational agents and analyze their limitations. Our experiments highlight both the strengths and weaknesses of these agents, providing valuable insights for future improvements in the field of conversational AI. DialSim is available at https://github.com/jiho283/Simulator.

Text-to-ECG: 12-Lead Electrocardiogram Synthesis conditioned on Clinical Text Reports

Mar 09, 2023

Hyunseung Chung, Jiho Kim, Joon-myoung Kwon, Ki-Hyun Jeon, Min Sung Lee, Edward Choi

Abstract: Electrocardiogram (ECG) synthesis is the area of research focused on generating realistic synthetic ECG signals for medical use without concerns over annotation costs or clinical data privacy restrictions. Traditional ECG generation models consider a single ECG lead and utilize GAN-based generative models. These models can only generate single lead samples and require separate training for each diagnosis class. The diagnosis classes of ECGs are insufficient to capture the intricate differences between ECGs depending on various features (e.g., patient demographic details, co-existing diagnosis classes, etc.). To alleviate these challenges, we present a text-to-ECG task, in which textual inputs are used to produce ECG outputs. Then we propose Auto-TTE, an autoregressive generative model conditioned on clinical text reports to synthesize 12-lead ECGs, for the first time to our knowledge. We compare the performance of our model with other representative models in text-to-speech and text-to-image. Experimental results show the superiority of our model in various quantitative evaluations and qualitative analysis. Finally, we conduct a user study with three board-certified cardiologists to confirm the fidelity and semantic alignment of generated samples. Our code will be available at https://github.com/TClife/text_to_ecg

* Accepted to ICASSP 2023 (5 pages, 3 figures, 4 tables)

Lead-agnostic Self-supervised Learning for Local and Global Representations of Electrocardiogram

Mar 18, 2022

Jungwoo Oh, Hyunseung Chung, Joon-myoung Kwon, Dong-gyun Hong, Edward Choi

Abstract: In recent years, self-supervised learning methods have shown significant improvement for pre-training with unlabeled data and have proven helpful for electrocardiogram signals. However, most previous pre-training methods for electrocardiogram focused on capturing only global contextual representations. This inhibits the models from learning fruitful representation of electrocardiogram, which results in poor performance on downstream tasks. Additionally, they cannot fine-tune the model with an arbitrary set of electrocardiogram leads unless the models were pre-trained on the same set of leads. In this work, we propose an ECG pre-training method that learns both local and global contextual representations for better generalizability and performance on downstream tasks. In addition, we propose random lead masking as an ECG-specific augmentation method to make our proposed model robust to an arbitrary set of leads. Experimental results on two downstream tasks, cardiac arrhythmia classification and patient identification, show that our proposed approach outperforms other state-of-the-art methods.

* Accepted at CHIL 2022 (16 pages, 3 figures, 4 tables)
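
The abstract above names random lead masking as an ECG-specific augmentation. The following is a minimal sketch of the general idea, assuming that masking means zeroing whole leads and that at least one lead is always kept. The abstract specifies neither detail, and the function name and `keep_prob` parameter are hypothetical.

```python
import random


def random_lead_mask(ecg, keep_prob=0.5, rng=None):
    """Zero out whole leads at random, always keeping at least one lead.

    ecg: list of leads, each a list of samples (e.g. 12 leads).
    keep_prob: hypothetical per-lead keep probability (not from the paper).
    rng: optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    keep = [rng.random() < keep_prob for _ in ecg]
    if not any(keep):
        # Assumption: never mask every lead, so the model always sees a signal.
        keep[rng.randrange(len(ecg))] = True
    return [list(lead) if k else [0.0] * len(lead) for lead, k in zip(ecg, keep)]


# Hypothetical toy recording: 12 leads, 5 samples each, all non-zero.
ecg = [[float(lead + 1)] * 5 for lead in range(12)]
masked = random_lead_mask(ecg, keep_prob=0.5, rng=random.Random(0))
```

Applying a fresh mask to each training sample exposes the model to many different lead subsets, which is the intuition behind robustness to an arbitrary set of leads.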

Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech

Jun 05, 2021

Hyunseung Chung, Sang-Hoon Lee, Seong-Whan Lee

Abstract: Text-to-speech (TTS) synthesis is the process of producing synthesized speech from text or phoneme input. Traditional TTS models contain multiple processing steps and require external aligners, which provide attention alignments of phoneme-to-frame sequences. As the complexity increases and efficiency decreases with every additional step, there is expanding demand in modern synthesis pipelines for end-to-end TTS with efficient internal aligners. In this work, we propose an end-to-end text-to-waveform network with a novel reinforcement learning based duration search method. Our proposed generator is feed-forward and the aligner trains the agent to make optimal duration predictions by receiving active feedback from actions taken to maximize cumulative reward. We demonstrate that accurate alignments of phoneme-to-frame sequences generated by trained agents enhance the fidelity and naturalness of synthesized audio. Experimental results also show the superiority of our proposed model compared to other state-of-the-art TTS models with internal and external aligners.

* Accepted in INTERSPEECH 2021
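
The abstract above links duration predictions to phoneme-to-frame alignments. As a small illustration of that link only, the sketch below expands per-phoneme durations (counted in frames) into a frame-level alignment. It does not model the reinforcement learning search itself; the function name and the example phonemes and durations are hypothetical.

```python
def expand_by_duration(phonemes, durations):
    """Repeat each phoneme for its predicted number of frames.

    phonemes: list of phoneme symbols.
    durations: list of non-negative integer frame counts, one per phoneme.
    Returns the frame-level phoneme sequence implied by the durations.
    """
    if len(phonemes) != len(durations):
        raise ValueError("phonemes and durations must have the same length")
    frames = []
    for phoneme, duration in zip(phonemes, durations):
        frames.extend([phoneme] * duration)
    return frames


# Hypothetical example: four phonemes with durations summing to 10 frames.
alignment = expand_by_duration(["HH", "AH", "L", "OW"], [2, 3, 1, 4])
```

Because the total number of output frames equals the sum of the durations, errors in duration prediction translate directly into misaligned or mistimed audio.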

Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning

Oct 19, 2020

Hyunseung Chung, Woo-Jeoung Nam, Seong-Whan Lee

Abstract: Remote sensing image retrieval (RSIR) is the process of ranking database images depending on the degree of similarity compared to the query image. As the complexity of RSIR increases due to the diversity in shooting range, angle, and location of remote sensors, there is an increasing demand for methods to address these issues and improve retrieval performance. In this work, we introduce a novel method for retrieving aerial images by merging group convolution with an attention mechanism and metric learning, resulting in robustness to rotational variations. For refinement and emphasis on important features, we applied channel attention in each group convolution stage. By utilizing the characteristics of group convolution and channel-wise attention, it is possible to acknowledge the equality among rotated but identically located images. The training procedure has two main steps: (i) training the network with the Aerial Image Dataset (AID) for classification, (ii) fine-tuning the network with triplet loss for retrieval with the Google Earth South Korea and NWPU-RESISC45 datasets. Results show that the proposed method outperforms other state-of-the-art retrieval methods in both rotated and original environments. Furthermore, we utilize class activation maps (CAM) to visualize the distinct difference of main features between our method and baseline, resulting in better adaptability in rotated environments.

* 8 pages, 5 figures, Accepted in ICPR 2020
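
The abstract above mentions fine-tuning with a triplet loss for retrieval. Below is a minimal sketch of the standard triplet loss on plain Python lists, assuming Euclidean distance and a margin of 1.0. Both are common defaults rather than values stated in the abstract, and some formulations use squared distances instead.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least the margin (a hypothetical value here).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings: the positive is close and the negative is far (no loss),
# then the two roles are swapped (large loss).
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

For retrieval, minimizing this loss pulls embeddings of images of the same location together and pushes embeddings of other locations apart, so that nearest-neighbor ranking reflects similarity.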
