Jimin Hong

PRePair: Pointwise Reasoning Enhance Pairwise Evaluating for Robust Instruction-Following Assessments

Jun 18, 2024

Hawon Jeong, ChaeHun Park, Jimin Hong, Jaegul Choo

Abstract: Pairwise evaluation using large language models (LLMs) is widely used for evaluating natural language generation (NLG) tasks. However, the reliability of LLMs is often compromised by biases, such as a preference for verbosity and an authoritative tone. In this study, we focus on the comparison of two LLM-based evaluation approaches, pointwise and pairwise. Our findings demonstrate that pointwise evaluators are more robust against undesirable preferences. Further analysis reveals that pairwise evaluators can accurately identify the shortcomings of low-quality outputs even when their judgment is incorrect. These results indicate that LLMs are more severely influenced by their bias in a pairwise evaluation setup. To mitigate this, we propose a hybrid method that integrates pointwise reasoning into pairwise evaluation. Experimental results show that our method enhances the robustness of pairwise evaluators against adversarial samples while preserving accuracy on normal samples.

A Simple Framework to Accelerate Multilingual Language Model for Monolingual Text Generation

Jan 19, 2024

Jimin Hong, Gibbeum Lee, Jaewoong Cho

Abstract: Recent advancements in large language models have facilitated the execution of complex language tasks, not only in English but also in non-English languages. However, the tokenizers of most language models, such as Llama, are trained on English-centric corpora and tend to excessively fragment tokens in non-English languages. This issue is especially pronounced in non-Roman alphabetic languages, which are often divided at a character or even Unicode level, leading to slower text generation. To address this, our study introduces a novel framework designed to expedite text generation in these languages. This framework predicts larger linguistic units than those of conventional multilingual tokenizers and is specifically tailored to the target language, thereby reducing the number of decoding steps required. Our empirical results demonstrate that the proposed framework increases the generation speed by a factor of 1.9 compared to standard decoding while maintaining the performance of a pre-trained multilingual model on monolingual tasks.
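To make the fragmentation issue concrete, here is a minimal sketch in Python. It assumes a worst case in which every character falls back to raw UTF-8 bytes with one token per byte; this is an illustrative assumption, not the behavior of any specific tokenizer, and the helper name `byte_fallback_token_count` is hypothetical.

```python
def byte_fallback_token_count(text: str) -> int:
    """Token count if every character is split into raw UTF-8 bytes.

    Assumption for illustration: one token per byte, which is the worst
    case for a tokenizer whose vocabulary lacks the script entirely.
    """
    return len(text.encode("utf-8"))


english = "hello"
korean = "안녕하세요"  # five Hangul syllables, three UTF-8 bytes each

english_tokens = byte_fallback_token_count(english)
korean_tokens = byte_fallback_token_count(korean)

# With one decoding step per token, the Korean greeting needs three times
# as many steps as it has characters under this assumption.
print(english_tokens, korean_tokens)
```

Predicting a larger unit per step (for example, a whole syllable or word) would reduce the number of decoding steps accordingly, which is the effect the abstract describes.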

Learning to Diversify Neural Text Generation via Degenerative Model

Sep 22, 2023

Jimin Hong, ChaeHun Park, Jaegul Choo

Abstract: Neural language models often fail to generate diverse and informative texts, limiting their applicability in real-world problems. While previous approaches have proposed to address these issues by identifying and penalizing undesirable behaviors (e.g., repetition, overuse of frequent words) from language models, we propose an alternative approach based on an observation: models primarily learn attributes within examples that are likely to cause degeneration problems. Based on this observation, we propose a new approach to prevent degeneration problems by training two models. Specifically, we first train a model that is designed to amplify undesirable patterns. We then enhance the diversity of the second model by focusing on patterns that the first model fails to learn. Extensive experiments on two tasks, namely language modeling and dialogue generation, demonstrate the effectiveness of our approach.

* IJCNLP-AACL2023 Findings, 10 pages

TeSS: Zero-Shot Classification via Textual Similarity Comparison with Prompting using Sentence Encoder

Dec 20, 2022

Jimin Hong, Jungsoo Park, Daeyoung Kim, Seongjae Choi, Bokyung Son, Jaewook Kang

Abstract: We introduce TeSS (Text Similarity Comparison using Sentence Encoder), a framework for zero-shot classification where the assigned label is determined by the embedding similarity between the input text and each candidate label prompt. We leverage representations from sentence encoders optimized to locate semantically similar samples closer to each other in embedding space during pre-training. The label prompt embeddings serve as prototypes of their corresponding class clusters. Furthermore, to compensate for labels that are poorly descriptive in their original format, we retrieve semantically similar sentences from external corpora and use them together with the original label prompt (TeSS-R). TeSS outperforms strong baselines on various closed-set and open-set classification datasets under the zero-shot setting, with further gains when combined with label prompt diversification through retrieval. These results are robust to verbalizer variations, an ancillary benefit of using a bi-encoder. Altogether, our method serves as a reliable baseline for zero-shot classification and a simple interface to assess the quality of sentence encoders.

* 9 pages, 3 figures
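The similarity-based labeling described in the abstract can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: `toy_encode` is a hypothetical bag-of-words stand-in for a real sentence encoder, and the label prompts are invented for the example.

```python
import math
from collections import Counter


def toy_encode(text: str) -> Counter:
    """Hypothetical stand-in for a sentence encoder: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text: str, label_prompts: dict) -> str:
    """Return the label whose prompt embedding is most similar to the input."""
    query = toy_encode(text)
    return max(
        label_prompts,
        key=lambda label: cosine(query, toy_encode(label_prompts[label])),
    )


# Invented label prompts; each acts as the prototype of its class.
label_prompts = {
    "sports": "this text is about sports and a game",
    "finance": "this text is about finance and the stock market",
}

prediction = classify("the stock market fell sharply", label_prompts)
print(prediction)
```

Because the input and the label prompts are encoded independently, as in a bi-encoder, changing the wording of a prompt only changes one prototype vector and does not require re-encoding the inputs.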

Reweighting Strategy based on Synthetic Data Identification for Sentence Similarity

Aug 30, 2022

Taehee Kim, ChaeHun Park, Jimin Hong, Radhika Dua, Edward Choi, Jaegul Choo

Abstract: Semantically meaningful sentence embeddings are important for numerous tasks in natural language processing. To obtain such embeddings, recent studies explored the idea of utilizing synthetically generated data from pretrained language models (PLMs) as a training corpus. However, PLMs often generate sentences that differ substantially from those written by humans. We hypothesize that treating all these synthetic examples equally for training deep neural networks can have an adverse effect on learning semantically meaningful embeddings. To analyze this, we first train a classifier that identifies machine-written sentences, and observe that the linguistic features of the sentences identified as written by a machine are significantly different from those of human-written sentences. Based on this, we propose a novel approach that first trains the classifier to measure the importance of each sentence. The distilled information from the classifier is then used to train a reliable sentence embedding model. Through extensive evaluation on four real-world datasets, we demonstrate that our model trained on synthetic data generalizes well and outperforms the existing baselines. Our implementation is publicly available at https://github.com/ddehun/coling2022_reweighting_sts.

* COLING2022
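The general idea of not treating all synthetic examples equally can be illustrated with a weighted training loss. The Python sketch below is a generic weighted average, not the method from the paper: how the classifier output is turned into a per-sentence weight is an assumption here, and `weighted_loss` is a hypothetical helper.

```python
def weighted_loss(losses: list, weights: list) -> float:
    """Weighted average of per-example losses.

    Assumption for illustration: each weight is an importance score
    derived from a machine-written-text classifier, with higher values
    for sentences judged more useful for training.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(l * w for l, w in zip(losses, weights)) / total_weight


# Two synthetic sentences with per-example losses 1.0 and 3.0.
uniform = weighted_loss([1.0, 3.0], [1.0, 1.0])     # both count equally
reweighted = weighted_loss([1.0, 3.0], [3.0, 1.0])  # first sentence counts more
print(uniform, reweighted)
```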

AVocaDo: Strategy for Adapting Vocabulary to Downstream Domain

Oct 26, 2021

Jimin Hong, Taehee Kim, Hyesu Lim, Jaegul Choo

Abstract: During the fine-tuning phase of transfer learning, the pretrained vocabulary remains unchanged, while model parameters are updated. The vocabulary generated based on the pretrained data is suboptimal for downstream data when domain discrepancy exists. We propose to consider the vocabulary as an optimizable parameter, allowing us to update the vocabulary by expanding it with domain-specific vocabulary based on a tokenization statistic. Furthermore, we prevent the embeddings of the added words from overfitting to downstream data by utilizing knowledge learned from a pretrained language model with a regularization term. Our method achieved consistent performance improvements on diverse domains (i.e., biomedical, computer science, news, and reviews).

* EMNLP2021 Accepted

Natural Attribute-based Shift Detection

Oct 18, 2021

Jeonghoon Park, Jimin Hong, Radhika Dua, Daehoon Gwak, Yixuan Li, Jaegul Choo, Edward Choi

Abstract: Despite the impressive performance of deep networks in vision, language, and healthcare, unpredictable behaviors on samples from a distribution different from the training distribution cause severe problems in deployment. For better reliability of neural-network-based classifiers, we define a new task, natural attribute-based shift (NAS) detection, to detect the samples shifted from the training distribution by some natural attribute such as age of subjects or brightness of images. Using the natural attributes present in existing datasets, we introduce benchmark datasets in the vision, language, and medical domains for NAS detection. Further, we conduct an extensive evaluation of prior representative out-of-distribution (OOD) detection methods on NAS datasets and observe an inconsistency in their performance. To understand this, we provide an analysis of the relationship between the location of NAS samples in the feature space and the performance of distance- and confidence-based OOD detection methods. Based on the analysis, we split NAS samples into three categories and further suggest a simple modification to the training objective to obtain an improved OOD detection method that is capable of detecting samples from all NAS categories.
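The distinction between confidence-based and distance-based OOD scores can be sketched as follows. This Python example uses two generic scores chosen for illustration (maximum softmax probability and Euclidean distance to the nearest class centroid); the specific detectors evaluated in the paper may differ, and both function names are hypothetical.

```python
import math


def max_softmax_confidence(logits: list) -> float:
    """Confidence-based score: the largest softmax probability.

    A low value suggests the classifier is unsure, which is often taken
    as a sign that the sample may be out of distribution.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # shift for numerical stability
    return max(exps) / sum(exps)


def min_centroid_distance(feature: list, centroids: list) -> float:
    """Distance-based score: Euclidean distance to the nearest class centroid.

    A high value suggests the sample lies far from the training data in
    feature space.
    """
    return min(math.dist(feature, centroid) for centroid in centroids)


confidence = max_softmax_confidence([0.0, 0.0])                    # maximally unsure
distance = min_centroid_distance([0.0, 0.0], [[3.0, 4.0], [6.0, 8.0]])
print(confidence, distance)
```

A sample can sit far from every centroid yet still receive a confident prediction, or the reverse, so the two kinds of score need not agree; this is one way the location of a sample in feature space can affect which detector flags it.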

Evaluation of Out-of-Distribution Detection Performance of Self-Supervised Learning in a Controllable Environment

Nov 26, 2020

Jeonghoon Park, Kyungmin Jo, Daehoon Gwak, Jimin Hong, Jaegul Choo, Edward Choi

Abstract: We evaluate the out-of-distribution (OOD) detection performance of self-supervised learning (SSL) techniques with a new evaluation framework. Unlike previous evaluation methods, the proposed framework adjusts the distance of OOD samples from the in-distribution samples. We evaluate an extensive combination of OOD detection algorithms on three different implementations of the proposed framework using simulated samples, images, and text. SSL methods consistently demonstrated improved OOD detection performance in all evaluation settings.

F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax

Oct 04, 2020

Byung-Ju Choi, Jimin Hong, David Keetae Park, Sang Wan Lee

Abstract: Despite recent advances in neural text generation, encoding the rich diversity in human language remains elusive. We argue that sub-optimal text generation is mainly attributable to the imbalanced token distribution, which particularly misdirects the learning model when trained with the maximum-likelihood objective. As a simple yet effective remedy, we propose two novel methods, F^2-Softmax and MefMax, for balanced training even with a skewed frequency distribution. MefMax assigns tokens uniquely to frequency classes, trying to group tokens with similar frequencies and equalize frequency mass between the classes. F^2-Softmax then decomposes the probability distribution of the target token into a product of two conditional probabilities: (i) the frequency class, and (ii) the token within the target frequency class. Models learn more uniform probability distributions because they are confined to subsets of vocabularies. Significant performance gains on seven relevant metrics suggest the superiority of our approach in improving not only the diversity but also the quality of generated texts.

* EMNLP 2020
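The factorization described in the abstract, where a token's probability is the product of its frequency-class probability and its probability within that class, can be sketched in Python. The class assignment, the logits, and the names `classes` and `factorized_token_probs` are hypothetical; in particular, the grouping here is hand-written and is not produced by MefMax.

```python
import math


def softmax(logits: list) -> list:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical frequency classes; each token belongs to exactly one class.
classes = {
    "frequent": ["the", "of"],
    "rare": ["avocado", "softmax", "zebra"],
}


def factorized_token_probs(class_logits: dict, token_logits: dict) -> dict:
    """p(token) = p(class) * p(token | class), one softmax per factor."""
    names = list(classes)
    p_class = dict(zip(names, softmax([class_logits[c] for c in names])))
    probs = {}
    for name in names:
        members = classes[name]
        # The inner softmax only normalizes over tokens of the same class.
        p_within = softmax([token_logits[t] for t in members])
        for token, p in zip(members, p_within):
            probs[token] = p_class[name] * p
    return probs


# All-zero logits: each class gets 0.5, split evenly among its members.
zero_tokens = {t: 0.0 for members in classes.values() for t in members}
probs = factorized_token_probs({"frequent": 0.0, "rare": 0.0}, zero_tokens)
print(probs)
```

Since each inner softmax runs over a subset of the vocabulary, rare tokens compete only with other rare tokens at that step, and the product of the two factors still sums to one over the full vocabulary.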
