Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yeonjoon Jung

GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

May 26, 2025

Yeonjoon Jung, Daehyun Ahn, Hyungjun Kim, Taesu Kim, Eunhyeok Park

Abstract:Low-Rank Adaptation (LoRA) is a popular method for parameter-efficient fine-tuning (PEFT) of generative models, valued for its simplicity and effectiveness. Despite recent enhancements, LoRA still suffers from a fundamental limitation: overfitting when the bottleneck is widened. It performs best at ranks 32-64, yet its accuracy stagnates or declines at higher ranks, still falling short of full fine-tuning (FFT) performance. We identify the root cause as LoRA's structural bottleneck, which introduces gradient entanglement to the unrelated input channels and distorts gradient propagation. To address this, we introduce a novel structure, Granular Low-Rank Adaptation (GraLoRA) that partitions weight matrices into sub-blocks, each with its own low-rank adapter. With negligible computational or storage cost, GraLoRA overcomes LoRA's limitations, effectively increases the representational capacity, and more closely approximates FFT behavior. Experiments on code generation and commonsense reasoning benchmarks show that GraLoRA consistently outperforms LoRA and other baselines, achieving up to +8.5% absolute gain in Pass@1 on HumanEval+. These improvements hold across model sizes and rank settings, making GraLoRA a scalable and robust solution for PEFT. Code, data, and scripts are available at https://github.com/SqueezeBits/GraLoRA.git

Via

Access Paper or Ask Questions

Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding

Oct 21, 2024

Yeonjoon Jung, Jaeseong Lee, Seungtaek Choi, Dohyeon Lee, Minsoo Kim, Seung-won Hwang

Abstract:Recently, pre-trained language models (PLMs) have been increasingly adopted in spoken language understanding (SLU). However, automatic speech recognition (ASR) systems frequently produce inaccurate transcriptions, leading to noisy inputs for SLU models, which can significantly degrade their performance. To address this, our objective is to train SLU models to withstand ASR errors by exposing them to noises commonly observed in ASR systems, referred to as ASR-plausible noises. Speech noise injection (SNI) methods have pursued this objective by introducing ASR-plausible noises, but we argue that these methods are inherently biased towards specific ASR systems, or ASR-specific noises. In this work, we propose a novel and less biased augmentation method of introducing the noises that are plausible to any ASR system, by cutting off the non-causal effect of noises. Experimental results and analyses demonstrate the effectiveness of our proposed methods in enhancing the robustness and generalizability of SLU models against unseen ASR systems by introducing more diverse and plausible ASR noises in advance.

* 9 pages, 3 figures

Via

Access Paper or Ask Questions

Triplet Edge Attention for Algorithmic Reasoning

Dec 09, 2023

Yeonjoon Jung, Sungsoo Ahn

Abstract:This work investigates neural algorithmic reasoning to develop neural networks capable of learning from classical algorithms. The main challenge is to develop graph neural networks that are expressive enough to predict the given algorithm outputs while generalizing well to out-of-distribution data. In this work, we introduce a new graph neural network layer called Triplet Edge Attention (TEA), an edge-aware graph attention layer. Our algorithm works by precisely computing edge latent, aggregating multiple triplet messages using edge-based attention. We empirically validate our TEA layer in the CLRS benchmark and demonstrate a $5%$ improvement on average. In particular, we achieve a $30%$ improvement for the string algorithms compared to the state-of-the-art model.

Via

Access Paper or Ask Questions