Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yukyung Lee

A Gradient Accumulation Method for Dense Retriever under Memory Constraint

Jun 18, 2024

Jaehee Kim, Yukyung Lee, Pilsung Kang

Figure 1 for A Gradient Accumulation Method for Dense Retriever under Memory Constraint

Figure 2 for A Gradient Accumulation Method for Dense Retriever under Memory Constraint

Figure 3 for A Gradient Accumulation Method for Dense Retriever under Memory Constraint

Figure 4 for A Gradient Accumulation Method for Dense Retriever under Memory Constraint

Abstract:InfoNCE loss is commonly used to train dense retriever in information retrieval tasks. It is well known that a large batch is essential to stable and effective training with InfoNCE loss, which requires significant hardware resources. Due to the dependency of large batch, dense retriever has bottleneck of application and research. Recently, memory reduction methods have been broadly adopted to resolve the hardware bottleneck by decomposing forward and backward or using a memory bank. However, current methods still suffer from slow and unstable training. To address these issues, we propose \longmodelname\ (\modelname), a stable and efficient memory reduction method for dense retriever trains that uses a dual memory bank structure to leverage previously generated query and passage representations. Experiments on widely used five information retrieval datasets indicate that \modelname\ can surpass not only existing memory reduction methods but also high-resource scenario. Moreover, theoretical analysis and experimental results confirm that \modelname\ provides more stable dual-encoder training than current memory bank utilization methods.

Via

Access Paper or Ask Questions

Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models

Apr 22, 2024

Yukyung Lee, Soonwon Ka, Bokyung Son, Pilsung Kang, Jaewook Kang

Abstract:Large Language Models (LLMs) have significantly impacted the writing process, enabling collaborative content creation and enhancing productivity. However, generating high-quality, user-aligned text remains challenging. In this paper, we propose Writing Path, a framework that uses explicit outlines to guide LLMs in generating goal-oriented, high-quality pieces of writing. Our approach draws inspiration from structured writing planning and reasoning paths, focusing on capturing and reflecting user intentions throughout the writing process. We construct a diverse dataset from unstructured blog posts to benchmark writing performance and introduce a comprehensive evaluation framework assessing the quality of outlines and generated texts. Our evaluations with GPT-3.5-turbo, GPT-4, and HyperCLOVA X demonstrate that the Writing Path approach significantly enhances text quality according to both LLMs and human evaluations. This study highlights the potential of integrating writing-specific techniques into LLMs to enhance their ability to meet the diverse writing needs of users.

* under review

Via

Access Paper or Ask Questions

CheckEval: Robust Evaluation Framework using Large Language Model via Checklist

Mar 27, 2024

Yukyung Lee, Joonghoon Kim, Jaehee Kim, Hyowon Cho, Pilsung Kang

Figure 1 for CheckEval: Robust Evaluation Framework using Large Language Model via Checklist

Figure 2 for CheckEval: Robust Evaluation Framework using Large Language Model via Checklist

Figure 3 for CheckEval: Robust Evaluation Framework using Large Language Model via Checklist

Figure 4 for CheckEval: Robust Evaluation Framework using Large Language Model via Checklist

Abstract:We introduce CheckEval, a novel evaluation framework using Large Language Models, addressing the challenges of ambiguity and inconsistency in current evaluation methods. CheckEval addresses these challenges by dividing evaluation criteria into detailed sub-aspects and constructing a checklist of Boolean questions for each, simplifying the evaluation. This approach not only renders the process more interpretable but also significantly enhances the robustness and reliability of results by focusing on specific evaluation dimensions. Validated through a focused case study using the SummEval benchmark, CheckEval indicates a strong correlation with human judgments. Furthermore, it demonstrates a highly consistent Inter-Annotator Agreement. These findings highlight the effectiveness of CheckEval for objective, flexible, and precise evaluations. By offering a customizable and interactive framework, CheckEval sets a new standard for the use of LLMs in evaluation, responding to the evolving needs of the field and establishing a clear method for future LLM-based evaluation.

* HEAL at CHI 2024

Via

Access Paper or Ask Questions

RAPID: Training-free Retrieval-based Log Anomaly Detection with PLM considering Token-level information

Nov 09, 2023

Gunho No, Yukyung Lee, Hyeongwon Kang, Pilsung Kang

Abstract:As the IT industry advances, system log data becomes increasingly crucial. Many computer systems rely on log texts for management due to restricted access to source code. The need for log anomaly detection is growing, especially in real-world applications, but identifying anomalies in rapidly accumulating logs remains a challenging task. Traditional deep learning-based anomaly detection models require dataset-specific training, leading to corresponding delays. Notably, most methods only focus on sequence-level log information, which makes the detection of subtle anomalies harder, and often involve inference processes that are difficult to utilize in real-time. We introduce RAPID, a model that capitalizes on the inherent features of log data to enable anomaly detection without training delays, ensuring real-time capability. RAPID treats logs as natural language, extracting representations using pre-trained language models. Given that logs can be categorized based on system context, we implement a retrieval-based technique to contrast test logs with the most similar normal logs. This strategy not only obviates the need for log-specific training but also adeptly incorporates token-level information, ensuring refined and robust detection, particularly for unseen logs. We also propose the core set technique, which can reduce the computational cost needed for comparison. Experimental results show that even without training on log data, RAPID demonstrates competitive performance compared to prior models and achieves the best performance on certain datasets. Through various research questions, we verified its capability for real-time detection without delay.

Via

Access Paper or Ask Questions

Painsight: An Extendable Opinion Mining Framework for Detecting Pain Points Based on Online Customer Reviews

Jun 03, 2023

Yukyung Lee, Jaehee Kim, Doyoon Kim, Yookyung Kho, Younsun Kim, Pilsung Kang

Abstract:As the e-commerce market continues to expand and online transactions proliferate, customer reviews have emerged as a critical element in shaping the purchasing decisions of prospective buyers. Previous studies have endeavored to identify key aspects of customer reviews through the development of sentiment analysis models and topic models. However, extracting specific dissatisfaction factors remains a challenging task. In this study, we delineate the pain point detection problem and propose Painsight, an unsupervised framework for automatically extracting distinct dissatisfaction factors from customer reviews without relying on ground truth labels. Painsight employs pre-trained language models to construct sentiment analysis and topic models, leveraging attribution scores derived from model gradients to extract dissatisfaction factors. Upon application of the proposed methodology to customer review data spanning five product categories, we successfully identified and categorized dissatisfaction factors within each group, as well as isolated factors for each type. Notably, Painsight outperformed benchmark methods, achieving substantial performance enhancements and exceptional results in human evaluations.

* WASSA at ACL 2023

Via

Access Paper or Ask Questions

DSTEA: Dialogue State Tracking with Entity Adaptive Pre-training

Jul 08, 2022

Yukyung Lee, Takyoung Kim, Hoonsang Yoon, Pilsung Kang, Junseong Bang, Misuk Kim

Figure 1 for DSTEA: Dialogue State Tracking with Entity Adaptive Pre-training

Figure 2 for DSTEA: Dialogue State Tracking with Entity Adaptive Pre-training

Figure 3 for DSTEA: Dialogue State Tracking with Entity Adaptive Pre-training

Figure 4 for DSTEA: Dialogue State Tracking with Entity Adaptive Pre-training

Abstract:Dialogue state tracking (DST) is a core sub-module of a dialogue system, which aims to extract the appropriate belief state (domain-slot-value) from a system and user utterances. Most previous studies have attempted to improve performance by increasing the size of the pre-trained model or using additional features such as graph relations. In this study, we propose dialogue state tracking with entity adaptive pre-training (DSTEA), a system in which key entities in a sentence are more intensively trained by the encoder of the DST model. DSTEA extracts important entities from input dialogues in four ways, and then applies selective knowledge masking to train the model effectively. Although DSTEA conducts only pre-training without directly infusing additional knowledge to the DST model, it achieved better performance than the best-known benchmark models on MultiWOZ 2.0, 2.1, and 2.2. The effectiveness of DSTEA was verified through various comparative experiments with regard to the entity type and different adaptive settings.

Via

Access Paper or Ask Questions

Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking

Mar 31, 2022

Takyoung Kim, Hoonsang Yoon, Yukyung Lee, Pilsung Kang, Misuk Kim

Figure 1 for Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking

Figure 2 for Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking

Figure 3 for Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking

Figure 4 for Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking

Abstract:Dialogue state tracking (DST) aims to extract essential information from multi-turn dialogue situations and take appropriate actions. A belief state, one of the core pieces of information, refers to the subject and its specific content, and appears in the form of domain-slot-value. The trained model predicts "accumulated" belief states in every turn, and joint goal accuracy and slot accuracy are mainly used to evaluate the prediction; however, we specify that the current evaluation metrics have a critical limitation when evaluating belief states accumulated as the dialogue proceeds, especially in the most used MultiWOZ dataset. Additionally, we propose relative slot accuracy to complement existing metrics. Relative slot accuracy does not depend on the number of predefined slots, and allows intuitive evaluation by assigning relative scores according to the turn of each dialogue. This study also encourages not solely the reporting of joint goal accuracy, but also various complementary metrics in DST tasks for the sake of a realistic evaluation.

* ACL 2022 (short)

Via

Access Paper or Ask Questions

LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

Nov 20, 2021

Yukyung Lee, Jina Kim, Pilsung Kang

Figure 1 for LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

Figure 2 for LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

Figure 3 for LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

Figure 4 for LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

Abstract:The system log generated in a computer system refers to large-scale data that are collected simultaneously and used as the basic data for determining simple errors and detecting external adversarial intrusion or the abnormal behaviors of insiders. The aim of system log anomaly detection is to promptly identify anomalies while minimizing human intervention, which is a critical problem in the industry. Previous studies performed anomaly detection through algorithms after converting various forms of log data into a standardized template using a parser. These methods involved generating a template for refining the log key. Particularly, a template corresponding to a specific event should be defined in advance for all the log data using which the information within the log key may get lost.In this study, we propose LAnoBERT, a parser free system log anomaly detection method that uses the BERT model, exhibiting excellent natural language processing performance. The proposed method, LAnoBERT, learns the model through masked language modeling, which is a BERT-based pre-training method, and proceeds with unsupervised learning-based anomaly detection using the masked language modeling loss function per log key word during the inference process. LAnoBERT achieved better performance compared to previous methodology in an experiment conducted using benchmark log datasets, HDFS, and BGL, and also compared to certain supervised learning-based models.

Via

Access Paper or Ask Questions

Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances

Aug 28, 2021

Takyoung Kim, Yukyung Lee, Hoonsang Yoon, Pilsung Kang, Misuk Kim

Figure 1 for Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances

Figure 2 for Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances

Figure 3 for Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances

Figure 4 for Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances

Abstract:The primary purpose of dialogue state tracking (DST), a critical component of an end-to-end conversational system, is to build a model that responds well to real-world situations. Although we often change our minds during ordinary conversations, current benchmark datasets do not adequately reflect such occurrences and instead consist of over-simplified conversations, in which no one changes their mind during a conversation. As the main question inspiring the present study,``Are current benchmark datasets sufficiently diverse to handle casual conversations in which one changes their mind?'' We found that the answer is ``No'' because simply injecting template-based turnback utterances significantly degrades the DST model performance. The test joint goal accuracy on the MultiWOZ decreased by over 5\%p when the simplest form of turnback utterance was injected. Moreover, the performance degeneration worsens when facing more complicated turnback situations. However, we also observed that the performance rebounds when a turnback is appropriately included in the training dataset, implying that the problem is not with the DST models but rather with the construction of the benchmark dataset.

* 13 pages, 5 figures

Via

Access Paper or Ask Questions

Multi$^2$OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT

Oct 07, 2020

Youngbin Ro, Yukyung Lee, Pilsung Kang

Figure 1 for Multi$^2$OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT

Figure 2 for Multi$^2$OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT

Figure 3 for Multi$^2$OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT

Figure 4 for Multi$^2$OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT

Abstract:In this paper, we propose Multi$^2$OIE, which performs open information extraction (open IE) by combining BERT with multi-head attention. Our model is a sequence-labeling system with an efficient and effective argument extraction method. We use a query, key, and value setting inspired by the Multimodal Transformer to replace the previously used bidirectional long short-term memory architecture with multi-head attention. Multi$^2$OIE outperforms existing sequence-labeling systems with high computational efficiency on two benchmark evaluation datasets, Re-OIE2016 and CaRB. Additionally, we apply the proposed method to multilingual open IE using multilingual BERT. Experimental results on new benchmark datasets introduced for two languages (Spanish and Portuguese) demonstrate that our model outperforms other multilingual systems without training data for the target languages.

* 11 pages, Findings of EMNLP 2020

Via

Access Paper or Ask Questions