Hyunsuk Ko

An Adaptive Method Stabilizing Activations for Enhanced Generalization

Jun 10, 2025

Hyunseok Seung, Jaewoo Lee, Hyunsuk Ko

Abstract: We introduce AdaAct, a novel optimization algorithm that adjusts learning rates according to activation variance. Our method enhances the stability of neuron outputs by incorporating neuron-wise adaptivity during the training process, which in turn leads to better generalization and complements conventional activation regularization methods. We evaluate AdaAct on CIFAR and ImageNet, comparing it with other state-of-the-art methods, and the experimental results demonstrate its competitive performance across these standard image classification benchmarks. Importantly, AdaAct effectively bridges the gap between the convergence speed of Adam and the strong generalization capabilities of SGD, all while maintaining competitive execution times. Code is available at https://github.com/hseung88/adaact.

* 2024 IEEE International Conference on Data Mining Workshops (ICDMW), Abu Dhabi, United Arab Emirates, 2024, pp. 9-16

NysAct: A Scalable Preconditioned Gradient Descent using Nystrom Approximation

Jun 10, 2025

Hyunseok Seung, Jaewoo Lee, Hyunsuk Ko

Abstract: Adaptive gradient methods are computationally efficient and converge quickly, but they often suffer from poor generalization. In contrast, second-order methods enhance convergence and generalization but typically incur high computational and memory costs. In this work, we introduce NysAct, a scalable first-order gradient preconditioning method that strikes a balance between state-of-the-art first-order and second-order optimization methods. NysAct leverages an eigenvalue-shifted Nystrom method to approximate the activation covariance matrix, which is used as a preconditioning matrix, significantly reducing time and memory complexities with minimal impact on test accuracy. Our experiments show that NysAct not only achieves improved test accuracy compared to both first-order and second-order methods but also demands considerably fewer computational resources than existing second-order methods. Code is available at https://github.com/hseung88/nysact.

* 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 2024, pp. 1442-1449
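
As a rough illustration of the Nystrom idea mentioned in the NysAct abstract, the sketch below builds a plain (unshifted) Nystrom approximation of a symmetric positive semidefinite matrix from a few of its columns. It is not the NysAct implementation: the eigenvalue shift and the use as a gradient preconditioner are omitted, and the function name `nystrom_approximation`, the toy matrix, and the choice of landmark columns are assumptions made for this example.

```python
import numpy as np


def nystrom_approximation(A, landmark_idx):
    """Plain Nystrom approximation A ~= C W^+ C^T of a symmetric PSD matrix.

    C holds the selected (landmark) columns of A and W is the block of A
    restricted to the landmark rows and columns. Hypothetical helper for
    illustration only; no eigenvalue shift is applied here.
    """
    C = A[:, landmark_idx]                      # n x k sampled columns
    W = A[np.ix_(landmark_idx, landmark_idx)]   # k x k landmark block
    return C @ np.linalg.pinv(W) @ C.T


# Toy stand-in for an activation covariance matrix: rank 3, size 8 x 8.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
A = X @ X.T

# With as many landmarks as the rank, the approximation is exact up to
# floating-point error, while only 3 of the 8 columns are ever read.
approx = nystrom_approximation(A, [0, 1, 2])
error = np.linalg.norm(A - approx)
```

The appeal of this construction is that only the k sampled columns need to be stored and only a k x k block needs to be inverted, which is where the time and memory savings over a full n x n factor come from.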

MAC: An Efficient Gradient Preconditioning using Mean Activation Approximated Curvature

Jun 10, 2025

Hyunseok Seung, Jaewoo Lee, Hyunsuk Ko

Abstract: Second-order optimization methods for training neural networks, such as KFAC, exhibit superior convergence by utilizing curvature information of the loss landscape. However, this comes at the expense of a high computational burden. In this work, we analyze the two components that constitute the layer-wise Fisher information matrix (FIM) used in KFAC: the Kronecker factors related to activations and pre-activation gradients. Based on empirical observations of their eigenspectra, we propose efficient approximations for them, resulting in a computationally efficient optimization method called MAC. To the best of our knowledge, MAC is the first algorithm to apply the Kronecker factorization to the FIM of attention layers used in transformers and to explicitly integrate attention scores into the preconditioning. We also study the convergence property of MAC on nonlinear neural networks and provide two conditions under which it converges to global minima. Our extensive evaluations on various network architectures and datasets show that the proposed method outperforms KFAC and other state-of-the-art methods in terms of accuracy, end-to-end training time, and memory usage. Code is available at https://github.com/hseung88/mac.

Study of Subjective and Objective Quality in Super-Resolution Enhanced Broadcast Images on a Novel SR-IQA Dataset

Sep 26, 2024

Yongrok Kim, Junha Shin, Juhyun Lee, Hyunsuk Ko

Abstract: To display low-quality broadcast content on high-resolution screens in full-screen format, the application of Super-Resolution (SR), a key consumer technology, is essential. Recently, SR methods have been developed that not only increase resolution while preserving the original image information but also enhance the perceived quality. However, evaluating the quality of SR images generated from low-quality sources, such as SR-enhanced broadcast content, is challenging due to the need to consider both distortions and improvements. Additionally, assessing SR image quality without original high-quality sources presents another significant challenge. Unfortunately, there has been a dearth of research specifically addressing the Image Quality Assessment (IQA) of SR images under these conditions. In this work, we introduce a new IQA dataset for SR broadcast images in both 2K and 4K resolutions. We conducted a subjective quality evaluation to obtain the Mean Opinion Score (MOS) for these SR images and performed a comprehensive human study to identify the key factors influencing the perceived quality. Finally, we evaluated the performance of existing IQA metrics on our dataset. This study reveals the limitations of current metrics, highlighting the need for a more robust IQA metric that better correlates with the perceived quality of SR images.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
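
The evaluation described in this abstract (collect subjective ratings, reduce them to a Mean Opinion Score per image, then check how well an objective metric tracks those scores) can be sketched as follows. This is a minimal illustration rather than the authors' protocol: the rating values, image identifiers, and metric scores are made up, and Spearman rank correlation is assumed here as the agreement measure because it is commonly used for this purpose.

```python
from statistics import mean


def mean_opinion_scores(ratings):
    """Average the raw subjective ratings of each image into one MOS value."""
    return {image_id: mean(scores) for image_id, scores in ratings.items()}


def ranks(values):
    """Return 1-based ranks of the values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical ratings from three viewers and hypothetical metric outputs.
ratings = {"img_a": [1, 2, 3], "img_b": [4, 5, 5], "img_c": [3, 3, 3]}
metric = {"img_a": 0.1, "img_b": 0.9, "img_c": 0.5}

mos = mean_opinion_scores(ratings)
ids = sorted(ratings)
srocc = spearman([mos[i] for i in ids], [metric[i] for i in ids])
```

A rank correlation near 1 means the metric orders the images the same way viewers do; a low value is the kind of evidence the abstract refers to when it says current metrics correlate poorly with perceived quality.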

Space-Time Video Regularity and Visual Fidelity: Compression, Resolution and Frame Rate Adaptation

Mar 31, 2021

Dae Yeol Lee, Hyunsuk Ko, Jongho Kim, Alan C. Bovik

Abstract: To deliver today's large volume of video content through limited-bandwidth channels in a perceptually optimal way, it is important to consider the perceptual trade-offs of compression and space-time downsampling protocols. In this direction, we have studied and developed new models of natural video statistics (NVS), which are useful because high-quality videos contain statistical regularities that are disturbed by distortions. Specifically, we model the statistics of divisively normalized differences between neighboring frames that are relatively displaced. In an extensive empirical study, we found that those paths of space-time displaced frame differences that provide maximal regularity against our NVS model generally align best with motion trajectories. Motivated by this, we build a new video quality prediction engine that extracts NVS features from displaced frame differences and combines them in a learned regressor that can accurately predict perceptual quality. As a stringent test of the new model, we apply it to the difficult problem of predicting the quality of videos subjected not only to compression, but also to downsampling in space and/or time. We show that the new quality model achieves state-of-the-art (SOTA) prediction performance on the new ETRI-LIVE Space-Time Subsampled Video Quality (STSVQ) database, which is dedicated to this problem. Downsampling protocols are of high interest to the streaming video industry, given rapid increases in frame resolutions and frame rates.

A Subjective and Objective Study of Space-Time Subsampled Video Quality

Jan 29, 2021

Dae Yeol Lee, Somdyuti Paul, Christos G. Bampis, Hyunsuk Ko, Jongho Kim, Se Yoon Jeong, Blake Homan, Alan C. Bovik

Abstract: Video dimensions are continuously increasing to provide more realistic and immersive experiences to global streaming and social media viewers. However, increases in video parameters such as spatial resolution and frame rate are inevitably associated with larger data volumes. Transmitting increasingly voluminous videos through limited-bandwidth networks in a perceptually optimal way is a current challenge affecting billions of viewers. One recent practice adopted by video service providers is space-time resolution adaptation in conjunction with video compression. Consequently, it is important to understand how different levels of space-time subsampling and compression affect the perceptual quality of videos. Towards making progress in this direction, we constructed a large new resource, called the ETRI-LIVE Space-Time Subsampled Video Quality (ETRI-LIVE STSVQ) database, containing 437 videos generated by applying various levels of combined space-time subsampling and video compression on 15 diverse video contents. We also conducted a large-scale human study on the new dataset, collecting about 15,000 subjective judgments of video quality. We provide a rate-distortion analysis of the collected subjective scores, enabling us to investigate the perceptual impact of space-time subsampling at different bit rates. We also evaluated and compared the performance of leading video quality models on the new database.

On the Space-Time Statistics of Motion Pictures

Jan 29, 2021

Dae Yeol Lee, Hyunsuk Ko, Jongho Kim, Alan C. Bovik

Abstract: It is well known that natural images possess statistical regularities that can be captured by bandpass decomposition and divisive normalization processes that approximate early neural processing in the human visual system. We expand on these studies and present new findings on the properties of space-time natural statistics that are inherent in motion pictures. Our model relies on the concept of temporal bandpass (e.g. lag) filtering in LGN and area V1, which is similar to smoothed frame differencing of video frames. Specifically, we model the statistics of the differences between adjacent or neighboring video frames that have been slightly spatially displaced relative to one another. We find that when these space-time differences are further subjected to locally pooled divisive normalization, statistical regularities (or lack thereof) arise that depend on the local motion trajectory. We find that bandpass and divisively normalized frame differences that are displaced along the motion direction exhibit stronger statistical regularities than those displaced in other directions. Conversely, the direction-dependent regularities of displaced frame differences can be used to estimate the image motion (optical flow) by finding the space-time displacement paths that best preserve statistical regularity.
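
The notion of a spatially displaced frame difference, and of searching over displacements to recover motion, can be illustrated with a small sketch. Note the substitution: instead of the paper's regularity measure on divisively normalized differences, this example scores each candidate displacement by the plain residual energy of the difference, which is a much cruder, block-matching-style criterion. The helper names, the circular shift at the borders, and the toy frames are assumptions for illustration.

```python
def shift(frame, dy, dx):
    """Circularly shift a 2-D frame (list of rows) by dy rows and dx columns."""
    h, w = len(frame), len(frame[0])
    return [[frame[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def displaced_difference(prev, curr, dy, dx):
    """Current frame minus the previous frame displaced by (dy, dx)."""
    moved = shift(prev, dy, dx)
    return [[c - m for c, m in zip(curr_row, moved_row)]
            for curr_row, moved_row in zip(curr, moved)]


def energy(diff):
    """Sum of squared values; a crude stand-in for a regularity measure."""
    return sum(v * v for row in diff for v in row)


def best_displacement(prev, curr, radius=1):
    """Displacement in a (2*radius+1)^2 window with the smallest residual."""
    candidates = [(dy, dx)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    return min(candidates,
               key=lambda d: energy(displaced_difference(prev, curr, *d)))


# Toy 4 x 5 frame with distinct pixel values, then the same content moved
# one pixel to the right to play the role of the next frame.
prev_frame = [[y * 5 + x for x in range(5)] for y in range(4)]
next_frame = shift(prev_frame, 0, 1)

estimated_motion = best_displacement(prev_frame, next_frame)
```

In this toy case the difference taken along the true motion vanishes entirely, while every other displacement leaves a residual, which mirrors in a simplified way the abstract's observation that differences displaced along the motion direction behave differently from the rest.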

A ParaBoost Stereoscopic Image Quality Assessment (PBSIQA) System

Mar 31, 2016

Hyunsuk Ko, Rui Song, C.-C. Jay Kuo

Abstract: The problem of stereoscopic image quality assessment, which finds applications in 3D visual content delivery such as 3DTV, is investigated in this work. Specifically, we propose a new ParaBoost (parallel-boosting) stereoscopic image quality assessment (PBSIQA) system. The system consists of two stages. In the first stage, various distortions are classified into a few types, and individual quality scorers, each targeting a specific distortion type, are developed. These scorers offer complementary performance in the face of a database consisting of heterogeneous distortion types. In the second stage, scores from multiple quality scorers are fused to achieve the best overall performance, where the fuser is designed based on the parallel boosting idea borrowed from machine learning. Extensive experiments are conducted to compare the performance of the proposed PBSIQA system with that of existing stereo image quality assessment (SIQA) metrics. The developed quality metric can serve as an objective function to optimize the performance of a 3D content delivery system.

Robust Uncalibrated Stereo Rectification with Constrained Geometric Distortions (USR-CGD)

Mar 31, 2016

Hyunsuk Ko, Han Suk Shim, Ouk Choi, C.-C. Jay Kuo

Abstract: A novel algorithm for uncalibrated stereo image-pair rectification under the constraint of geometric distortion, called USR-CGD, is presented in this work. Although it is straightforward to define a rectifying transformation (or homography) given the epipolar geometry, many existing algorithms introduce unwanted geometric distortions as a side effect. To obtain rectified images with reduced geometric distortions while maintaining a small rectification error, we parameterize the homography by considering the influence of various kinds of geometric distortions. Next, we define several geometric measures and incorporate them into a new cost function for parameter optimization. Finally, we propose a constrained adaptive optimization scheme to balance the rectification error against the geometric error. Extensive experimental results are provided to demonstrate the performance of the proposed USR-CGD method, which outperforms existing algorithms by a significant margin.

MCL-3D: a database for stereoscopic image quality assessment using 2D-image-plus-depth source

Mar 23, 2014

Rui Song, Hyunsuk Ko, C.-C. Jay Kuo

Abstract: A new stereoscopic image quality assessment database rendered using the 2D-image-plus-depth source, called MCL-3D, is described, and the performance benchmarking of several known 2D and 3D image quality metrics using the MCL-3D database is presented in this work. Nine image-plus-depth sources are first selected, and a depth image-based rendering (DIBR) technique is used to render stereoscopic image pairs. Distortions applied to either the texture image or the depth image before stereoscopic image rendering include Gaussian blur, additive white noise, down-sampling blur, JPEG and JPEG-2000 (JP2K) compression, and transmission error. Furthermore, the distortion caused by imperfect rendering is also examined. The MCL-3D database contains 693 stereoscopic image pairs, of which one third have a resolution of 1024x728 and two thirds have a resolution of 1920x1080. Pairwise comparison was adopted in the subjective test for user friendliness, and the Mean Opinion Score (MOS) can be computed accordingly. Finally, we evaluate the performance of several 2D and 3D image quality metrics applied to MCL-3D. All texture images, depth images, and rendered image pairs in MCL-3D, together with their MOS values obtained in the subjective test, are available to the public (http://mcl.usc.edu/mcl-3d-database/) for future research and development.
