Huije Lee

Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding

Feb 08, 2025

Sukmin Cho, Sangjin Choi, Taeho Hwang, Jeongyeon Seo, Soyeong Jeong, Huije Lee, Hoyun Song, Jong C. Park, Youngjin Kwon

Abstract: Accelerating inference in Large Language Models (LLMs) is critical for real-time interactions, as they have been widely incorporated into real-world services. Speculative decoding, a fully algorithmic solution, has gained attention for improving inference speed by drafting and verifying tokens, thereby generating multiple tokens in a single forward pass. However, current drafting strategies usually require significant fine-tuning or have inconsistent performance across tasks. To address these challenges, we propose Hierarchy Drafting (HD), a novel lossless drafting approach that organizes various token sources into multiple databases in a hierarchical framework based on temporal locality. In the drafting step, HD sequentially accesses multiple databases to obtain draft tokens from the highest to the lowest locality, ensuring consistent acceleration across diverse tasks and minimizing drafting latency. Our experiments on Spec-Bench using LLMs with 7B and 13B parameters demonstrate that HD outperforms existing database drafting methods, achieving robust inference speedups across model sizes, tasks, and temperatures.

* Findings of NAACL 2025
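
The drafting step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`hierarchical_draft`, `accept_prefix`), the dictionary-based databases, and the fixed context length are all assumptions made for illustration. The sketch only shows the idea of consulting databases in order of locality and keeping the draft prefix that the target model confirms.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical types: a database maps a short token context to a stored continuation.
Context = Tuple[int, ...]
Database = Dict[Context, List[int]]


def hierarchical_draft(
    tokens: Sequence[int],
    databases: Sequence[Database],
    context_len: int = 2,
) -> List[int]:
    """Return draft tokens from the first database that knows the current context.

    `databases` is assumed to be ordered from highest to lowest temporal locality.
    """
    key = tuple(tokens[-context_len:])
    for db in databases:
        draft = db.get(key)
        if draft:
            return list(draft)
    return []  # no draft found; the caller falls back to ordinary decoding


def accept_prefix(draft: Sequence[int], verified: Sequence[int]) -> List[int]:
    """Keep the longest draft prefix that matches the target model's own tokens."""
    accepted: List[int] = []
    for d, v in zip(draft, verified):
        if d != v:
            break
        accepted.append(d)
    return accepted


# Illustrative data only: two databases, the first with higher locality.
high_locality_db: Database = {(5, 6): [7, 8]}
low_locality_db: Database = {(5, 6): [9], (1, 2): [3]}
draft = hierarchical_draft([4, 5, 6], [high_locality_db, low_locality_db])
```

Because only tokens confirmed by the target model are kept, a sketch of this shape leaves the output unchanged, which is what "lossless" refers to in the abstract.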

Towards Effective Counter-Responses: Aligning Human Preferences with Strategies to Combat Online Trolling

Oct 05, 2024

Huije Lee, Hoyun Song, Jisu Shin, Sukmin Cho, SeungYoon Han, Jong C. Park

Abstract: Trolling in online communities typically involves disruptive behaviors such as provoking anger and manipulating discussions, leading to a polarized atmosphere and emotional distress. Robust moderation is essential for mitigating these negative impacts and maintaining a healthy and constructive community atmosphere. However, effectively addressing trolls is difficult because their behaviors vary widely and require different response strategies (RSs) to counter them. This diversity makes it challenging to choose an appropriate RS for each specific situation. To address this challenge, our research investigates whether humans have preferred strategies tailored to different types of trolling behaviors. Our findings reveal a correlation between the types of trolling encountered and the preferred RS. In this paper, we introduce a methodology for generating counter-responses to trolls by recommending appropriate RSs, supported by a dataset aligning these strategies with human preferences across various troll contexts. The experimental results demonstrate that our proposed approach guides constructive discussion and reduces the negative effects of trolls, thereby enhancing the online community environment.

* Findings of EMNLP 2024

Universal Gloss-level Representation for Gloss-free Sign Language Translation and Production

Jul 03, 2024

Eui Jun Hwang, Sukmin Cho, Huije Lee, Youngwoo Yoon, Jong C. Park

Abstract: Sign language, essential for the deaf and hard-of-hearing, presents unique challenges in translation and production due to its multimodal nature and the inherent ambiguity in mapping sign language motion to spoken language words. Previous methods often rely on gloss annotations, requiring time-intensive labor and specialized expertise in sign language. Gloss-free methods have emerged to address these limitations, but they often depend on external sign language data or dictionaries, failing to completely eliminate the need for gloss annotations. There is a clear demand for a comprehensive approach that can supplant gloss annotations and be utilized for both Sign Language Translation (SLT) and Sign Language Production (SLP). We introduce Universal Gloss-level Representation (UniGloR), a unified and self-supervised solution for both SLT and SLP, trained on multiple datasets including PHOENIX14T, How2Sign, and NIASL2021. Our results demonstrate UniGloR's effectiveness in the translation and production tasks. We further report an encouraging result for the Sign Language Recognition (SLR) on previously unseen data. Our study suggests that self-supervised learning can be made in a unified manner, paving the way for innovative and practical applications in future research.

* 14 pages, 5 figures

Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models

Jun 06, 2024

Jisu Shin, Hoyun Song, Huije Lee, Soyeong Jeong, Jong C. Park

Abstract: Social bias is shaped by the accumulation of social perceptions towards targets across various demographic identities. To fully understand such social bias in large language models (LLMs), it is essential to consider the composite of social perceptions from diverse perspectives among identities. Previous studies have either evaluated biases in LLMs by indirectly assessing the presence of sentiments towards demographic identities in the generated text or measuring the degree of alignment with given stereotypes. These methods have limitations in directly quantifying social biases at the level of distinct perspectives among identities. In this paper, we aim to investigate how social perceptions from various viewpoints contribute to the development of social bias in LLMs. To this end, we propose a novel strategy to intuitively quantify these social perceptions and suggest metrics that can evaluate the social biases within LLMs by aggregating diverse social perceptions. The experimental results show the quantitative demonstration of the social attitude in LLMs by examining social perception. The analysis we conducted shows that our proposed metrics capture the multi-dimensional aspects of social bias, enabling a fine-grained and comprehensive investigation of bias in LLMs.

* Findings of ACL 2024

Autoregressive Sign Language Production: A Gloss-Free Approach with Discrete Representations

Sep 21, 2023

Eui Jun Hwang, Huije Lee, Jong C. Park

Abstract: Gloss-free Sign Language Production (SLP) offers a direct translation of spoken language sentences into sign language, bypassing the need for gloss intermediaries. This paper presents the Sign language Vector Quantization Network, a novel approach to SLP that leverages Vector Quantization to derive discrete representations from sign pose sequences. Our method, rooted in both manual and non-manual elements of signing, supports advanced decoding methods and integrates latent-level alignment for enhanced linguistic coherence. Through comprehensive evaluations, we demonstrate superior performance of our method over prior SLP methods and highlight the reliability of Back-Translation and Fréchet Gesture Distance as evaluation metrics.

* 5 pages, 3 figures, 6 tables

A Simple and Flexible Modeling for Mental Disorder Detection by Learning from Clinical Questionnaires

Jun 05, 2023

Hoyun Song, Jisu Shin, Huije Lee, Jong C. Park

Abstract: Social media is one of the most highly sought resources for analyzing characteristics of the language by its users. In particular, many researchers utilized various linguistic features of mental health problems from social media. However, existing approaches to detecting mental disorders face critical challenges, such as the scarcity of high-quality data or the trade-off between addressing the complexity of models and presenting interpretable results grounded in expert domain knowledge. To address these challenges, we design a simple but flexible model that preserves domain-based interpretability. We propose a novel approach that captures the semantic meanings directly from the text and compares them to symptom-related descriptions. Experimental results demonstrate that our model outperforms relevant baselines on various mental disorder detection tasks. Our detailed analysis shows that the proposed model is effective at leveraging domain knowledge, transferable to other mental disorders, and providing interpretable detection results.

* ACL 2023, 15 pages, 11 tables, 4 figures

ELF22: A Context-based Counter Trolling Dataset to Combat Internet Trolls

Aug 02, 2022

Huije Lee, Young Ju NA, Hoyun Song, Jisu Shin, Jong C. Park

Abstract: Online trolls increase social costs and cause psychological damage to individuals. With the proliferation of automated accounts making use of bots for trolling, it is difficult for targeted individual users to handle the situation both quantitatively and qualitatively. To address this issue, we focus on automating the method to counter trolls, as counter responses to combat trolls encourage community users to maintain ongoing discussion without compromising freedom of expression. For this purpose, we propose a novel dataset for automatic counter response generation. In particular, we constructed a pair-wise dataset that includes troll comments and counter responses with labeled response strategies, which enables models fine-tuned on our dataset to generate responses by varying counter responses according to the specified strategy. We conducted three tasks to assess the effectiveness of our dataset and evaluated the results through both automatic and human evaluation. In human evaluation, we demonstrate that the model fine-tuned on our dataset shows a significantly improved performance in strategy-controlled sentence generation.

* Accepted for LREC 2022
