Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jaehyun Park

Active Learning for Continual Learning: Keeping the Past Alive in the Present

Jan 24, 2025

Jaehyun Park, Dongmin Park, Jae-Gil Lee

Abstract:Continual learning (CL) enables deep neural networks to adapt to ever-changing data distributions. In practice, there may be scenarios where annotation is costly, leading to active continual learning (ACL), which performs active learning (AL) for the CL scenarios when reducing the labeling cost by selecting the most informative subset is preferable. However, conventional AL strategies are not suitable for ACL, as they focus solely on learning the new knowledge, leading to catastrophic forgetting of previously learned tasks. Therefore, ACL requires a new AL strategy that can balance the prevention of catastrophic forgetting and the ability to quickly learn new tasks. In this paper, we propose AccuACL, Accumulated informativeness-based Active Continual Learning, by the novel use of the Fisher information matrix as a criterion for sample selection, derived from a theoretical analysis of the Fisher-optimality preservation properties within the framework of ACL, while also addressing the scalability issue of Fisher information-based AL. Extensive experiments demonstrate that AccuACL significantly outperforms AL baselines across various CL algorithms, increasing the average accuracy and forgetting by 23.8% and 17.0%, respectively, in average.

Via

Access Paper or Ask Questions

Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task

Oct 15, 2024

Yunho Kim, Jaehyun Park, Heejun Kim, Sejin Kim, Byung-Jun Lee, Sundong Kim

Figure 1 for Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task

Figure 2 for Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task

Figure 3 for Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task

Figure 4 for Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task

Abstract:Effective long-term strategies enable AI systems to navigate complex environments by making sequential decisions over extended horizons. Similarly, reinforcement learning (RL) agents optimize decisions across sequences to maximize rewards, even without immediate feedback. To verify that Latent Diffusion-Constrained Q-learning (LDCQ), a prominent diffusion-based offline RL method, demonstrates strong reasoning abilities in multi-step decision-making, we aimed to evaluate its performance on the Abstraction and Reasoning Corpus (ARC). However, applying offline RL methodologies to enhance strategic reasoning in AI for solving tasks in ARC is challenging due to the lack of sufficient experience data in the ARC training set. To address this limitation, we introduce an augmented offline RL dataset for ARC, called Synthesized Offline Learning Data for Abstraction and Reasoning (SOLAR), along with the SOLAR-Generator, which generates diverse trajectory data based on predefined rules. SOLAR enables the application of offline RL methods by offering sufficient experience data. We synthesized SOLAR for a simple task and used it to train an agent with the LDCQ method. Our experiments demonstrate the effectiveness of the offline RL approach on a simple ARC task, showing the agent's ability to make multi-step sequential decisions and correctly identify answer states. These results highlight the potential of the offline RL approach to enhance AI's strategic reasoning capabilities.

* Preprint, Under review. Comments welcome

Via

Access Paper or Ask Questions

DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation

Oct 15, 2024

Jaehyun Park, Yunho Kim, Sejin Kim, Byung-Jun Lee, Sundong Kim

Figure 1 for DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation

Figure 2 for DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation

Figure 3 for DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation

Figure 4 for DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation

Abstract:We propose a novel offline reinforcement learning (offline RL) approach, introducing the Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation (DIAR) framework. We address two key challenges in offline RL: out-of-distribution samples and long-horizon problems. We leverage diffusion models to learn state-action sequence distributions and incorporate value functions for more balanced and adaptive decision-making. DIAR introduces an Adaptive Revaluation mechanism that dynamically adjusts decision lengths by comparing current and future state values, enabling flexible long-term decision-making. Furthermore, we address Q-value overestimation by combining Q-network learning with a value function guided by a diffusion model. The diffusion model generates diverse latent trajectories, enhancing policy robustness and generalization. As demonstrated in tasks like Maze2D, AntMaze, and Kitchen, DIAR consistently outperforms state-of-the-art algorithms in long-horizon, sparse-reward environments.

* Preprint, under review. Comments welcome

Via

Access Paper or Ask Questions

Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation

Jul 07, 2024

Jin Woo Lee, Jaehyun Park, Min Jun Choi, Kyogu Lee

Figure 1 for Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation

Figure 2 for Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation

Figure 3 for Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation

Figure 4 for Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation

Abstract:While significant advancements have been made in music generation and differentiable sound synthesis within machine learning and computer audition, the simulation of instrument vibration guided by physical laws has been underexplored. To address this gap, we introduce a novel model for simulating the spatio-temporal motion of nonlinear strings, integrating modal synthesis and spectral modeling within a neural network framework. Our model leverages physical properties and fundamental frequencies as inputs, outputting string states across time and space that solve the partial differential equation characterizing the nonlinear string. Empirical evaluations demonstrate that the proposed architecture achieves superior accuracy in string motion simulation compared to existing baseline architectures. The code and demo are available online.

Via

Access Paper or Ask Questions

Reinforcement Learning for Infinite-Horizon Average-Reward MDPs with Multinomial Logistic Function Approximation

Jun 19, 2024

Jaehyun Park, Dabeen Lee

Figure 1 for Reinforcement Learning for Infinite-Horizon Average-Reward MDPs with Multinomial Logistic Function Approximation

Figure 2 for Reinforcement Learning for Infinite-Horizon Average-Reward MDPs with Multinomial Logistic Function Approximation

Figure 3 for Reinforcement Learning for Infinite-Horizon Average-Reward MDPs with Multinomial Logistic Function Approximation

Abstract:We study model-based reinforcement learning with non-linear function approximation where the transition function of the underlying Markov decision process (MDP) is given by a multinomial logistic (MNL) model. In this paper, we develop two algorithms for the infinite-horizon average reward setting. Our first algorithm \texttt{UCRL2-MNL} applies to the class of communicating MDPs and achieves an $\tilde{\mathcal{O}}(dD\sqrt{T})$ regret, where $d$ is the dimension of feature mapping, $D$ is the diameter of the underlying MDP, and $T$ is the horizon. The second algorithm \texttt{OVIFH-MNL} is computationally more efficient and applies to the more general class of weakly communicating MDPs, for which we show a regret guarantee of $\tilde{\mathcal{O}}(d^{2/5} \mathrm{sp}(v^*)T^{4/5})$ where $\mathrm{sp}(v^*)$ is the span of the associated optimal bias function. We also prove a lower bound of $\Omega(d\sqrt{DT})$ for learning communicating MDPs with MNL transitions of diameter at most $D$. Furthermore, we show a regret lower bound of $\Omega(dH^{3/2}\sqrt{K})$ for learning $H$-horizon episodic MDPs with MNL function approximation where $K$ is the number of episodes, which improves upon the best-known lower bound for the finite-horizon setting.

Via

Access Paper or Ask Questions

Understanding Contrastive Learning Through the Lens of Margins

Jun 20, 2023

Daniel Rho, TaeSoo Kim, Sooill Park, Jaehyun Park, JaeHan Park

Abstract:Self-supervised learning, or SSL, holds the key to expanding the usage of machine learning in real-world tasks by alleviating heavy human supervision. Contrastive learning and its varieties have been SSL strategies in various fields. We use margins as a stepping stone for understanding how contrastive learning works at a deeper level and providing potential directions to improve representation learning. Through gradient analysis, we found that margins scale gradients in three different ways: emphasizing positive samples, de-emphasizing positive samples when angles of positive samples are wide, and attenuating the diminishing gradients as the estimated probability approaches the target probability. We separately analyze each and provide possible directions for improving SSL frameworks. Our experimental results demonstrate that these properties can contribute to acquiring better representations, which can enhance performance in both seen and unseen datasets.

Via

Access Paper or Ask Questions

Unraveling the ARC Puzzle: Mimicking Human Solutions with Object-Centric Decision Transformer

Jun 14, 2023

Jaehyun Park, Jaegyun Im, Sanha Hwang, Mintaek Lim, Sabina Ualibekova, Sejin Kim, Sundong Kim

Figure 1 for Unraveling the ARC Puzzle: Mimicking Human Solutions with Object-Centric Decision Transformer

Figure 2 for Unraveling the ARC Puzzle: Mimicking Human Solutions with Object-Centric Decision Transformer

Figure 3 for Unraveling the ARC Puzzle: Mimicking Human Solutions with Object-Centric Decision Transformer

Figure 4 for Unraveling the ARC Puzzle: Mimicking Human Solutions with Object-Centric Decision Transformer

Abstract:In the pursuit of artificial general intelligence (AGI), we tackle Abstraction and Reasoning Corpus (ARC) tasks using a novel two-pronged approach. We employ the Decision Transformer in an imitation learning paradigm to model human problem-solving, and introduce an object detection algorithm, the Push and Pull clustering method. This dual strategy enhances AI's ARC problem-solving skills and provides insights for AGI progression. Yet, our work reveals the need for advanced data collection tools, robust training datasets, and refined model structures. This study highlights potential improvements for Decision Transformers and propels future AGI research.

Via

Access Paper or Ask Questions

Blind Estimation of Audio Processing Graph

Mar 15, 2023

Sungho Lee, Jaehyun Park, Seungryeol Paik, Kyogu Lee

Abstract:Musicians and audio engineers sculpt and transform their sounds by connecting multiple processors, forming an audio processing graph. However, most deep-learning methods overlook this real-world practice and assume fixed graph settings. To bridge this gap, we develop a system that reconstructs the entire graph from a given reference audio. We first generate a realistic graph-reference pair dataset and train a simple blind estimation system composed of a convolutional reference encoder and a transformer-based graph decoder. We apply our model to singing voice effects and drum mixing estimation tasks. Evaluation results show that our method can reconstruct complex signal routings, including multi-band processing and sidechaining.

* Accepted to ICASSP 2023

Via

Access Paper or Ask Questions

Analysis of UAV Radar and Communication Network Coexistence with Different Multiple Access Protocols

Nov 29, 2022

Sung Joon Maeng, Jaehyun Park, Ismail Guvenc

Abstract:Unmanned aerial vehicles (UAVs) are expected to be used extensively in the future for various applications, either as user equipment (UEs) connected to a cellular wireless network, or as an infrastructure extension of an existing wireless network to serve other UEs. Next generation wireless networks will consider the use of UAVs for joint communication and radar and/or as dedicated radars for various sensing applications. Increasing number of UAVs will naturally result in larger number of communication and/or radar links that may cause interference to nearby networks, exacerbated further by the higher likelihood of line-of-sight signal propagation from UAVs even to distant receivers. With all these, it is critical to study network coexistence of UAV-mounted base stations (BSs) and radar transceivers. In this paper, using stochastic geometry, we derive closed-form expressions to characterize the performance of coexisting UAV radar and communication networks for spectrum overlay multiple access (SOMA) and time-division multiple access (TDMA). We evaluate successful ranging probability (SRP) and the transmission capacity (TC) and compare the performance of TDMA and SOMA. Our results show that SOMA can outperform TDMA on both SRP and TC when the node density of active UAV-radars is larger than the node density of UAV-comms.

Via

Access Paper or Ask Questions

ASAP: Accurate semantic segmentation for real time performance

Oct 04, 2022

Jaehyun Park, Subin Lee, Eon Kim, Byeongjun Moon, Dabeen Yu, Yeonseung Yu, Junghwan Kim

Figure 1 for ASAP: Accurate semantic segmentation for real time performance

Figure 2 for ASAP: Accurate semantic segmentation for real time performance

Figure 3 for ASAP: Accurate semantic segmentation for real time performance

Figure 4 for ASAP: Accurate semantic segmentation for real time performance

Abstract:Feature fusion modules from encoder and self-attention module have been adopted in semantic segmentation. However, the computation of these modules is costly and has operational limitations in real-time environments. In addition, segmentation performance is limited in autonomous driving environments with a lot of contextual information perpendicular to the road surface, such as people, buildings, and general objects. In this paper, we propose an efficient feature fusion method, Feature Fusion with Different Norms (FFDN) that utilizes rich global context of multi-level scale and vertical pooling module before self-attention that preserves most contextual information while reducing the complexity of global context encoding in the vertical direction. By doing this, we could handle the properties of representation in global space and reduce additional computational cost. In addition, we analyze low performance in challenging cases including small and vertically featured objects. We achieve the mean Interaction of-union(mIoU) of 73.1 and the Frame Per Second(FPS) of 191, which are comparable results with state-of-the-arts on Cityscapes test datasets.

* 5 pages, 4 figures

Via

Access Paper or Ask Questions