Haejun Chung

EZ-Sort: Efficient Pairwise Comparison via Zero-Shot CLIP-Based Pre-Ordering and Human-in-the-Loop Sorting

Aug 29, 2025

Yujin Park, Haejun Chung, Ikbeom Jang

Abstract:Pairwise comparison is often favored over absolute rating or ordinal classification in subjective or difficult annotation tasks due to its improved reliability. However, exhaustive comparisons require a massive number of annotations (O(n^2)). Recent work has greatly reduced the annotation burden (O(n log n)) by actively sampling pairwise comparisons using a sorting algorithm. We further improve annotation efficiency by (1) roughly pre-ordering items using the Contrastive Language-Image Pre-training (CLIP) model hierarchically without training, and (2) replacing easy, obvious human comparisons with automated comparisons. The proposed EZ-Sort first produces a CLIP-based zero-shot pre-ordering, then initializes bucket-aware Elo scores, and finally runs an uncertainty-guided human-in-the-loop MergeSort. Validation was conducted using various datasets: face-age estimation (FGNET), historical image chronology (DHCI), and retinal image quality assessment (EyePACS). It showed that EZ-Sort reduced human annotation cost by 90.5% compared to exhaustive pairwise comparisons and by 19.8% compared to prior work (when n = 100), while improving or maintaining inter-rater reliability. These results demonstrate that combining CLIP-based priors with uncertainty-aware sampling yields an efficient and scalable solution for pairwise ranking.

* 5 pages, 2 figures, Accepted at CIKM 2025 (ACM International Conference on Information and Knowledge Management)
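
The abstract's core idea, a merge sort that asks a human only when a prior cannot settle a comparison, can be illustrated with a minimal sketch. It is not the authors' implementation: the names `elo_expected`, `hitl_merge_sort`, `prior`, `ask_human`, and the fixed `margin` threshold are assumptions made for illustration. The paper's bucket-aware Elo initialization and uncertainty criterion may differ.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple


def elo_expected(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def hitl_merge_sort(
    items: Sequence[Hashable],
    prior: Dict[Hashable, float],
    ask_human: Callable[[Hashable, Hashable], bool],
    margin: float = 0.3,
) -> Tuple[List[Hashable], int]:
    """Merge sort that queries a human only for uncertain pairs.

    prior maps each item to an Elo-like score (assumed to come from a
    pre-ordering step). ask_human(a, b) returns True if a ranks below b.
    Returns the sorted list and the number of human queries used.
    """
    queries = 0

    def less(a: Hashable, b: Hashable) -> bool:
        nonlocal queries
        p = elo_expected(prior[b], prior[a])  # prior belief that b outranks a
        if abs(p - 0.5) >= margin:            # confident: decide automatically
            return p > 0.5
        queries += 1                          # uncertain: ask the annotator
        return ask_human(a, b)

    def sort(seq: List[Hashable]) -> List[Hashable]:
        if len(seq) <= 1:
            return list(seq)
        mid = len(seq) // 2
        left, right = sort(seq[:mid]), sort(seq[mid:])
        merged: List[Hashable] = []
        i = j = 0
        while i < len(left) and j < len(right):
            if less(right[j], left[i]):
                merged.append(right[j])
                j += 1
            else:
                merged.append(left[i])
                i += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    return sort(list(items)), queries
```

In this sketch, items whose prior scores differ by a wide gap never reach the annotator, so the human query count stays below the merge-sort comparison count, which is itself far below the n(n-1)/2 pairs of an exhaustive scheme (4,950 pairs for n = 100).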

Physics-guided and fabrication-aware inverse design of photonic devices using diffusion models

Apr 23, 2025

Dongjin Seo, Soobin Um, Sangbin Lee, Jong Chul Ye, Haejun Chung

Abstract:Designing free-form photonic devices is fundamentally challenging due to the vast number of possible geometries and the complex requirements of fabrication constraints. Traditional inverse-design approaches--whether driven by human intuition, global optimization, or adjoint-based gradient methods--often involve intricate binarization and filtering steps, while recent deep learning strategies demand prohibitively large numbers of simulations (10^5 to 10^6). To overcome these limitations, we present AdjointDiffusion, a physics-guided framework that integrates adjoint sensitivity gradients into the sampling process of diffusion models. AdjointDiffusion begins by training a diffusion network on a synthetic, fabrication-aware dataset of binary masks. During inference, we compute the adjoint gradient of a candidate structure and inject this physics-based guidance at each denoising step, steering the generative process toward high figure-of-merit (FoM) solutions without additional post-processing. We demonstrate our method on two canonical photonic design problems--a bent waveguide and a CMOS image sensor color router--and show that our method consistently outperforms state-of-the-art nonlinear optimizers (such as MMA and SLSQP) in both efficiency and manufacturability, while using orders of magnitude fewer simulations (approximately 2 x 10^2) than pure deep learning approaches (approximately 10^5 to 10^6). By eliminating complex binarization schedules and minimizing simulation overhead, AdjointDiffusion offers a streamlined, simulation-efficient, and fabrication-aware pipeline for next-generation photonic device design. Our open-source implementation is available at https://github.com/dongjin-seo2020/AdjointDiffusion.

* 25 pages, 7 Figures

BOLDSimNet: Examining Brain Network Similarity between Task and Resting-State fMRI

Apr 02, 2025

Boseong Kim, Debashis Das Chakladar, Haejun Chung, Ikbeom Jang

Abstract:Traditional causal connectivity methods in task-based and resting-state functional magnetic resonance imaging (fMRI) face challenges in accurately capturing directed information flow due to their sensitivity to noise and inability to model multivariate dependencies. These limitations hinder the effective comparison of brain networks between cognitive states, making it difficult to analyze network reconfiguration during task and resting states. To address these issues, we propose BOLDSimNet, a novel framework utilizing Multivariate Transfer Entropy (MTE) to measure causal connectivity and network similarity across different cognitive states. Our method groups functionally similar regions of interest (ROIs) rather than spatially adjacent nodes, improving accuracy in network alignment. We applied BOLDSimNet to fMRI data from 40 healthy controls and found that children exhibited higher similarity scores between task and resting states compared to adolescents, indicating reduced variability in attention shifts. In contrast, adolescents showed more differences between task and resting states in the Dorsal Attention Network (DAN) and the Default Mode Network (DMN), reflecting enhanced network adaptability. These findings emphasize developmental variations in the reconfiguration of the causal brain network, showcasing BOLDSimNet's ability to quantify network similarity and identify attentional fluctuations between different cognitive states.

Uncertainty-Weighted Mutual Distillation for Multi-View Fusion

Nov 15, 2024

Jiwoong Yang, Haejun Chung, Ikbeom Jang

Abstract:Multi-view learning often faces challenges in effectively leveraging images captured from different angles and locations. This challenge is particularly pronounced when addressing inconsistencies and uncertainties between views. In this paper, we propose a novel Multi-View Uncertainty-Weighted Mutual Distillation (MV-UWMD) method. Our method enhances prediction consistency by performing hierarchical mutual distillation across all possible view combinations, including single-view, partial multi-view, and full multi-view predictions. This introduces an uncertainty-based weighting mechanism through mutual distillation, allowing effective exploitation of unique information from each view while mitigating the impact of uncertain predictions. We extend a CNN-Transformer hybrid architecture to facilitate robust feature learning and integration across multiple view combinations. We conducted extensive experiments using a large, unstructured dataset captured from diverse, non-fixed viewpoints. The results demonstrate that MV-UWMD improves prediction accuracy and consistency compared to existing multi-view learning approaches.
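
To make "all possible view combinations" concrete, the sketch below enumerates every non-empty subset of a set of views, from single views up to the full set. The function name `view_combinations` and the view labels are hypothetical. How MV-UWMD pairs and weights these combinations during distillation is not specified in the abstract and is not modeled here.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def view_combinations(views: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return every non-empty subset of views, ordered by subset size."""
    combos: List[Tuple[str, ...]] = []
    for size in range(1, len(views) + 1):
        combos.extend(combinations(views, size))
    return combos


# Hypothetical view labels; V views give 2**V - 1 combinations.
for combo in view_combinations(["front", "side", "top"]):
    print(combo)
```

The count grows as 2^V - 1, so three views yield seven prediction heads to keep consistent in a scheme of this kind.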

Calibration of ordinal regression networks

Oct 21, 2024

Daehwan Kim, Haejun Chung, Ikbeom Jang

Abstract:Recent studies have shown that deep neural networks are not well-calibrated and produce over-confident predictions. The miscalibration issue primarily stems from the minimization of cross-entropy, which aims to align predicted softmax probabilities with one-hot labels. In ordinal regression tasks, this problem is compounded by an additional challenge: the expectation that softmax probabilities should exhibit a unimodal distribution is not met with cross-entropy. Rather, the ordinal regression literature has focused on unimodality and overlooked calibration. To address these issues, we propose a novel loss function that introduces order-aware calibration, ensuring that prediction confidence adheres to ordinal relationships between classes. It incorporates soft ordinal encoding and label-smoothing-based regularization to enforce both calibration and unimodality. Extensive experiments across three popular ordinal regression benchmarks demonstrate that our approach achieves state-of-the-art calibration without compromising accuracy.
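
As a rough illustration of the two ingredients named in the abstract, the sketch below builds a soft ordinal target (a softmax over negative class distance, one common form of soft ordinal encoding) and then applies uniform label smoothing. This is not the paper's loss. The function names, the absolute-distance metric, and the `scale` and `eps` parameters are assumptions.

```python
import math
from typing import List


def soft_ordinal_targets(true_class: int, num_classes: int, scale: float = 1.0) -> List[float]:
    """Softmax over negative absolute distance to the true class."""
    logits = [-scale * abs(k - true_class) for k in range(num_classes)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def smooth(targets: List[float], eps: float = 0.1) -> List[float]:
    """Uniform label smoothing: mix the target with a flat distribution."""
    k = len(targets)
    return [(1.0 - eps) * t + eps / k for t in targets]


def cross_entropy(targets: List[float], probs: List[float]) -> float:
    """Cross-entropy between a soft target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(targets, probs) if t > 0.0)


def is_unimodal(p: List[float]) -> bool:
    """True if p is non-decreasing up to its peak and non-increasing after."""
    peak = p.index(max(p))
    rising = all(p[i] <= p[i + 1] for i in range(peak))
    falling = all(p[i] >= p[i + 1] for i in range(peak, len(p) - 1))
    return rising and falling
```

Because smoothing rescales the target and adds the same constant to every class, it preserves the ordering of the entries, so a unimodal soft target stays unimodal after smoothing.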

Facial Wrinkle Segmentation for Cosmetic Dermatology: Pretraining with Texture Map-Based Weak Supervision

Aug 19, 2024

Junho Moon, Haejun Chung, Ikbeom Jang

Abstract:Facial wrinkle detection plays a crucial role in cosmetic dermatology. Precise manual segmentation of facial wrinkles is challenging and time-consuming, with inherent subjectivity leading to inconsistent results among graders. To address this issue, we propose two solutions. First, we build and release the first public facial wrinkle dataset, 'FFHQ-Wrinkle', an extension of the NVIDIA FFHQ dataset. This dataset includes 1,000 images with human labels and 50,000 images with automatically generated weak labels. This dataset can help the research community develop advanced wrinkle detection algorithms. Second, we introduce a training strategy for U-Net-like encoder-decoder models to detect wrinkles across the face automatically. Our method employs a two-stage training strategy: texture map pretraining and finetuning on human-labeled data. Initially, we pretrain models on a large dataset with weak labels (N=50k) or masked texture maps generated through computer vision techniques, without human intervention. Subsequently, we finetune the models using human-labeled data (N=1k), which consists of manually labeled wrinkle masks. During finetuning, the network takes as input a combination of RGB and masked texture maps, comprising four channels. We effectively combine labels from multiple annotators to minimize subjectivity in manual labeling. Our strategies demonstrate improved segmentation performance in facial wrinkle segmentation both quantitatively and visually compared to existing pretraining methods.

Wave Interpolation Neural Operator: Interpolated Prediction of Electric Fields Across Untrained Wavelengths

Aug 06, 2024

Joonhyuk Seo, Chanik Kang, Dongjin Seo, Haejun Chung

Abstract:Designing photonic structures requires electromagnetic simulations, which often incur high computational costs. Researchers have developed surrogate solvers for predicting electric fields to alleviate the computational issues. However, existing surrogate solvers are limited to performing inference at fixed simulation conditions and require retraining for different conditions. To address this, we propose Wave Interpolation Neural Operator (WINO), a novel surrogate solver enabling simulation condition interpolation across a continuous spectrum of broadband wavelengths. WINO introduces the Fourier Group Convolution Shuffling operator and a new conditioning method to efficiently predict electric fields from both trained and untrained wavelength data, achieving significant improvements in parameter efficiency and spectral interpolation performance. Our model demonstrates approximately 100 times faster performance than traditional finite-difference frequency-domain simulations. Moreover, compared to the state-of-the-art model, we achieve a 74% reduction in parameters, an 80.5% improvement in prediction accuracy for untrained wavelengths, and a 13.2% improvement for trained wavelengths.

* 9 pages, 5 figures, 4 tables / Appendix: 4 pages, 4 figures, 3 tables

Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Image Synthesis: T1 MRI to Tau-PET

Jun 18, 2024

Symac Kim, Junho Moon, Haejun Chung, Ikbeom Jang

Abstract:Alzheimer's Disease (AD) is the most common form of dementia, characterised by cognitive decline and biomarkers such as tau-proteins. Tau-positron emission tomography (tau-PET), which employs a radiotracer to selectively bind, detect, and visualise tau protein aggregates within the brain, is valuable for early AD diagnosis but is less accessible due to high costs, limited availability, and its invasive nature. Image synthesis with neural networks enables the generation of tau-PET images from more accessible T1-weighted magnetic resonance imaging (MRI) images. To ensure high-quality image synthesis, we propose a cyclic 2.5D perceptual loss combined with mean squared error and structural similarity index measure (SSIM) losses. The cyclic 2.5D perceptual loss sequentially calculates the axial 2D average perceptual loss for a specified number of epochs, followed by the coronal and sagittal planes for the same number of epochs. This sequence is cyclically performed, with intervals reducing as the cycles repeat. We conduct supervised synthesis of tau-PET images from T1w MRI images using 516 paired T1w MRI and tau-PET 3D images from the ADNI database. For the collected data, we perform preprocessing, including intensity standardisation for tau-PET images from each manufacturer. The proposed loss, applied to generative 3D U-Net and its variants, outperformed those with 2.5D and 3D perceptual losses in SSIM and peak signal-to-noise ratio (PSNR). In addition, adding the cyclic 2.5D perceptual loss to the original losses of GAN-based image synthesis models such as CycleGAN and Pix2Pix improves SSIM and PSNR by at least 2% and 3%, respectively. Furthermore, by-manufacturer PET standardisation helps the models synthesise higher-quality images than min-max PET normalisation.

* 24 pages, 5 figures
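
The cyclic schedule described in the abstract (axial, then coronal, then sagittal, with shorter intervals in later cycles) can be sketched as a per-epoch plane selector. The abstract does not state how the interval shrinks, so the halving rule, the floor of one epoch, and the names `cyclic_plane_schedule`, `initial_interval`, and `shrink` are assumptions made for illustration.

```python
from typing import List


def cyclic_plane_schedule(
    total_epochs: int, initial_interval: int, shrink: float = 0.5
) -> List[str]:
    """Return the plane used for the 2D perceptual loss at each epoch.

    Each cycle visits axial, coronal, and sagittal for `interval` epochs
    apiece; the interval is then multiplied by `shrink` (assumed rule)
    and never drops below one epoch.
    """
    planes = ("axial", "coronal", "sagittal")
    schedule: List[str] = []
    interval = initial_interval
    while len(schedule) < total_epochs:
        for plane in planes:
            schedule.extend([plane] * interval)
        interval = max(1, int(interval * shrink))
    return schedule[:total_epochs]


# Example: 12 epochs, starting with 2 epochs per plane.
print(cyclic_plane_schedule(12, 2))
```

A training loop would look up `schedule[epoch]` and compute the averaged 2D perceptual loss over slices along that plane for the epoch.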

Deep-learning-driven end-to-end metalens imaging

Dec 05, 2023

Joonhyuk Seo, Jaegang Jo, Joohoon Kim, Joonho Kang, Chanik Kang, Seongwon Moon, Eunji Lee, Jehyeong Hong, Junsuk Rho, Haejun Chung

Abstract:Recent advances in metasurface lenses (metalenses) have shown great potential for opening a new era in compact imaging, photography, light detection and ranging (LiDAR), and virtual reality/augmented reality (VR/AR) applications. However, the fundamental trade-off between broadband focusing efficiency and operating bandwidth limits the performance of broadband metalenses, resulting in chromatic aberration, angular aberration, and a relatively low efficiency. In this study, a deep-learning-based image restoration framework is proposed to overcome these limitations and realize end-to-end metalens imaging, thereby achieving aberration-free full-color imaging for mass-produced metalenses with a 10-mm diameter. Neural network-assisted metalens imaging achieved a high resolution comparable to that of the ground truth image.

* 15 pages, 5 figures, 1 table
