Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jiwon Kim

MedRep: Medical Concept Representation for General Electronic Health Record Foundation Models

Apr 11, 2025

Junmo Kim, Namkyeong Lee, Jiwon Kim, Kwangsoo Kim

Abstract:Electronic health record (EHR) foundation models have been an area ripe for exploration with their improved performance in various medical tasks. Despite the rapid advances, there exists a fundamental limitation: Processing unseen medical codes out of the vocabulary. This problem limits the generality of EHR foundation models and the integration of models trained with different vocabularies. To deal with this problem, we propose MedRep for EHR foundation models based on the observational medical outcome partnership (OMOP) common data model (CDM), providing the integrated medical concept representations and the basic data augmentation strategy for patient trajectories. For concept representation learning, we enrich the information of each concept with a minimal definition through large language model (LLM) prompts and enhance the text-based representations through graph ontology of OMOP vocabulary. Trajectory augmentation randomly replaces selected concepts with other similar concepts that have closely related representations to let the model practice with the concepts out-of-vocabulary. Finally, we demonstrate that EHR foundation models trained with MedRep better maintain the prediction performance in external datasets. Our code implementation is publicly available at https://github.com/kicarussays/MedRep.

* Under review

Via

Access Paper or Ask Questions

Identity-preserving Distillation Sampling by Fixed-Point Iterator

Feb 27, 2025

SeonHwa Kim, Jiwon Kim, Soobin Park, Donghoon Ahn, Jiwon Kang, Seungryong Kim, Kyong Hwan Jin, Eunju Cha

Abstract:Score distillation sampling (SDS) demonstrates a powerful capability for text-conditioned 2D image and 3D object generation by distilling the knowledge from learned score functions. However, SDS often suffers from blurriness caused by noisy gradients. When SDS meets the image editing, such degradations can be reduced by adjusting bias shifts using reference pairs, but the de-biasing techniques are still corrupted by erroneous gradients. To this end, we introduce Identity-preserving Distillation Sampling (IDS), which compensates for the gradient leading to undesired changes in the results. Based on the analysis that these errors come from the text-conditioned scores, a new regularization technique, called fixed-point iterative regularization (FPR), is proposed to modify the score itself, driving the preservation of the identity even including poses and structures. Thanks to a self-correction by FPR, the proposed method provides clear and unambiguous representations corresponding to the given prompts in image-to-image editing and editable neural radiance field (NeRF). The structural consistency between the source and the edited data is obviously maintained compared to other state-of-the-art methods.

Via

Access Paper or Ask Questions

Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

Jan 06, 2025

Hanbin Bae, Byungjun Kang, Jiwon Kim, Jaeyong Hwang, Hosang Sung, Hoon-Young Cho

Figure 1 for Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

Figure 2 for Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

Figure 3 for Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

Figure 4 for Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

Abstract:This study emphasizes the significance of exploring distance-based source separation (DSS) in outdoor environments. Unlike existing studies that primarily focus on indoor settings, the proposed model is designed to capture the unique characteristics of outdoor audio sources. It incorporates advanced techniques, including a two-stage conformer block, a linear relation-aware self-attention (RSA), and a TensorFlow Lite GPU delegate. While the linear RSA may not capture physical cues as explicitly as the quadratic RSA, the linear RSA enhances the model's context awareness, leading to improved performance on the DSS that requires an understanding of physical cues in outdoor and indoor environments. The experimental results demonstrated that the proposed model overcomes the limitations of existing approaches and considerably enhances energy efficiency and real-time inference speed on mobile devices.

* Accepted by ICASSP2025. \c{opyright} 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component

Via

Access Paper or Ask Questions

Context-Aware LLM Translation System Using Conversation Summarization and Dialogue History

Oct 22, 2024

Mingi Sung, Seungmin Lee, Jiwon Kim, Sejoon Kim

Figure 1 for Context-Aware LLM Translation System Using Conversation Summarization and Dialogue History

Figure 2 for Context-Aware LLM Translation System Using Conversation Summarization and Dialogue History

Figure 3 for Context-Aware LLM Translation System Using Conversation Summarization and Dialogue History

Figure 4 for Context-Aware LLM Translation System Using Conversation Summarization and Dialogue History

Abstract:Translating conversational text, particularly in customer support contexts, presents unique challenges due to its informal and unstructured nature. We propose a context-aware LLM translation system that leverages conversation summarization and dialogue history to enhance translation quality for the English-Korean language pair. Our approach incorporates the two most recent dialogues as raw data and a summary of earlier conversations to manage context length effectively. We demonstrate that this method significantly improves translation accuracy, maintaining coherence and consistency across conversations. This system offers a practical solution for customer support translation tasks, addressing the complexities of conversational text.

* Accepted to WMT 2024

Via

Access Paper or Ask Questions

A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research

Oct 09, 2024

Seongjin Choi, Zhixiong Jin, Seungwoo Ham, Jiwon Kim, Lijun Sun

Abstract:Deep Generative Models (DGMs) have rapidly advanced in recent years, becoming essential tools in various fields due to their ability to learn complex data distributions and generate synthetic data. Their importance in transportation research is increasingly recognized, particularly for applications like traffic data generation, prediction, and feature extraction. This paper offers a comprehensive introduction and tutorial on DGMs, with a focus on their applications in transportation. It begins with an overview of generative models, followed by detailed explanations of fundamental models, a systematic review of the literature, and practical tutorial code to aid implementation. The paper also discusses current challenges and opportunities, highlighting how these models can be effectively utilized and further developed in transportation research. This paper serves as a valuable reference, guiding researchers and practitioners from foundational knowledge to advanced applications of DGMs in transportation research.

* 64 pages, 21 figures, 4 tables

Via

Access Paper or Ask Questions

Panopticus: Omnidirectional 3D Object Detection on Resource-constrained Edge Devices

Oct 02, 2024

Jeho Lee, Chanyoung Jung, Jiwon Kim, Hojung Cha

Abstract:3D object detection with omnidirectional views enables safety-critical applications such as mobile robot navigation. Such applications increasingly operate on resource-constrained edge devices, facilitating reliable processing without privacy concerns or network delays. To enable cost-effective deployment, cameras have been widely adopted as a low-cost alternative to LiDAR sensors. However, the compute-intensive workload to achieve high performance of camera-based solutions remains challenging due to the computational limitations of edge devices. In this paper, we present Panopticus, a carefully designed system for omnidirectional and camera-based 3D detection on edge devices. Panopticus employs an adaptive multi-branch detection scheme that accounts for spatial complexities. To optimize the accuracy within latency limits, Panopticus dynamically adjusts the model's architecture and operations based on available edge resources and spatial characteristics. We implemented Panopticus on three edge devices and conducted experiments across real-world environments based on the public self-driving dataset and our mobile 360{\deg} camera dataset. Experiment results showed that Panopticus improves accuracy by 62% on average given the strict latency objective of 33ms. Also, Panopticus achieves a 2.1{\times} latency reduction on average compared to baselines.

* Published at MobiCom 2024

Via

Access Paper or Ask Questions

Blind-Match: Efficient Homomorphic Encryption-Based 1:N Matching for Privacy-Preserving Biometric Identification

Aug 12, 2024

Hyunmin Choi, Jiwon Kim, Chiyoung Song, Simon S. Woo, Hyoungshick Kim

Abstract:We present Blind-Match, a novel biometric identification system that leverages homomorphic encryption (HE) for efficient and privacy-preserving 1:N matching. Blind-Match introduces a HE-optimized cosine similarity computation method, where the key idea is to divide the feature vector into smaller parts for processing rather than computing the entire vector at once. By optimizing the number of these parts, Blind-Match minimizes execution time while ensuring data privacy through HE. Blind-Match achieves superior performance compared to state-of-the-art methods across various biometric datasets. On the LFW face dataset, Blind-Match attains a 99.63% Rank-1 accuracy with a 128-dimensional feature vector, demonstrating its robustness in face recognition tasks. For fingerprint identification, Blind-Match achieves a remarkable 99.55% Rank-1 accuracy on the PolyU dataset, even with a compact 16-dimensional feature vector, significantly outperforming the state-of-the-art method, Blind-Touch, which achieves only 59.17%. Furthermore, Blind-Match showcases practical efficiency in large-scale biometric identification scenarios, such as Naver Cloud's FaceSign, by processing 6,144 biometric samples in 0.74 seconds using a 128-dimensional feature vector.

* Accepted to CIKM 2024 (Applied Research Track)

Via

Access Paper or Ask Questions

D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection

Mar 14, 2024

Dinh Phat Do, Taehoon Kim, Jaemin Na, Jiwon Kim, Keonho Lee, Kyunghwan Cho, Wonjun Hwang

Figure 1 for D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection

Abstract:Domain adaptation for object detection typically entails transferring knowledge from one visible domain to another visible domain. However, there are limited studies on adapting from the visible to the thermal domain, because the domain gap between the visible and thermal domains is much larger than expected, and traditional domain adaptation can not successfully facilitate learning in this situation. To overcome this challenge, we propose a Distinctive Dual-Domain Teacher (D3T) framework that employs distinct training paradigms for each domain. Specifically, we segregate the source and target training sets for building dual-teachers and successively deploy exponential moving average to the student model to individual teachers of each domain. The framework further incorporates a zigzag learning method between dual teachers, facilitating a gradual transition from the visible to thermal domains during training. We validate the superiority of our method through newly designed experimental protocols with well-known thermal datasets, i.e., FLIR and KAIST. Source code is available at https://github.com/EdwardDo69/D3T .

* Accepted by CVPR 2024. Link: https://github.com/EdwardDo69/D3T

Via

Access Paper or Ask Questions

PANDAS: Prototype-based Novel Class Discovery and Detection

Feb 27, 2024

Tyler L. Hayes, César R. de Souza, Namil Kim, Jiwon Kim, Riccardo Volpi, Diane Larlus

Figure 1 for PANDAS: Prototype-based Novel Class Discovery and Detection

Figure 2 for PANDAS: Prototype-based Novel Class Discovery and Detection

Figure 3 for PANDAS: Prototype-based Novel Class Discovery and Detection

Figure 4 for PANDAS: Prototype-based Novel Class Discovery and Detection

Abstract:Object detectors are typically trained once and for all on a fixed set of classes. However, this closed-world assumption is unrealistic in practice, as new classes will inevitably emerge after the detector is deployed in the wild. In this work, we look at ways to extend a detector trained for a set of base classes so it can i) spot the presence of novel classes, and ii) automatically enrich its repertoire to be able to detect those newly discovered classes together with the base ones. We propose PANDAS, a method for novel class discovery and detection. It discovers clusters representing novel classes from unlabeled data, and represents old and new classes with prototypes. During inference, a distance-based classifier uses these prototypes to assign a label to each detected object instance. The simplicity of our method makes it widely applicable. We experimentally demonstrate the effectiveness of PANDAS on the VOC 2012 and COCO-to-LVIS benchmarks. It performs favorably against the state of the art for this task while being computationally more affordable.

Via

Access Paper or Ask Questions

SoK: Facial Deepfake Detectors

Jan 09, 2024

Binh M. Le, Jiwon Kim, Shahroz Tariq, Kristen Moore, Alsharif Abuadbba, Simon S. Woo

Figure 1 for SoK: Facial Deepfake Detectors

Figure 2 for SoK: Facial Deepfake Detectors

Figure 3 for SoK: Facial Deepfake Detectors

Figure 4 for SoK: Facial Deepfake Detectors

Abstract:Deepfakes have rapidly emerged as a profound and serious threat to society, primarily due to their ease of creation and dissemination. This situation has triggered an accelerated development of deepfake detection technologies. However, many existing detectors rely heavily on lab-generated datasets for validation, which may not effectively prepare them for novel, emerging, and real-world deepfake techniques. In this paper, we conduct an extensive and comprehensive review and analysis of the latest state-of-the-art deepfake detectors, evaluating them against several critical criteria. These criteria facilitate the categorization of these detectors into 4 high-level groups and 13 fine-grained sub-groups, all aligned with a unified standard conceptual framework. This classification and framework offer deep and practical insights into the factors that affect detector efficacy. We assess the generalizability of 16 leading detectors across various standard attack scenarios, including black-box, white-box, and gray-box settings. Our systematized analysis and experimentation lay the groundwork for a deeper understanding of deepfake detectors and their generalizability, paving the way for future research focused on creating detectors adept at countering various attack scenarios. Additionally, this work offers insights for developing more proactive defenses against deepfakes.

* 18 pages, 6 figures, 5 table, under peer-review

Via

Access Paper or Ask Questions