Abstract: This technical report introduces the EXAONE 3.5 instruction-tuned language models, developed and released by LG AI Research. The EXAONE 3.5 language models are offered in three configurations: 32B, 7.8B, and 2.4B. These models feature several standout capabilities: 1) exceptional instruction following in real-world scenarios, achieving the highest scores across seven benchmarks, 2) outstanding long-context comprehension, attaining the top performance in four benchmarks, and 3) competitive results compared to state-of-the-art open models of similar sizes across nine general benchmarks. The EXAONE 3.5 language models are open to anyone for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE. For commercial use, please reach out to the official contact point of LG AI Research: contact_us@lgresearch.ai.
Abstract: We introduce the EXAONE 3.0 instruction-tuned language model, the first open model in the family of Large Language Models (LLMs) developed by LG AI Research. Among different model sizes, we publicly release the 7.8B instruction-tuned model to promote open research and innovation. Through extensive evaluations across a wide range of public and in-house benchmarks, EXAONE 3.0 demonstrates highly competitive real-world performance and instruction-following capability against other state-of-the-art open models of similar size. Our comparative analysis shows that EXAONE 3.0 excels particularly in Korean, while achieving compelling performance across general tasks and complex reasoning. With its strong real-world effectiveness and bilingual proficiency, we hope that EXAONE continues to contribute to advancements in Expert AI. Our EXAONE 3.0 instruction-tuned model is available at https://huggingface.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct.
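A minimal usage sketch for the released checkpoint, assuming the Hugging Face transformers API; the dtype, trust_remote_code flag, and generation settings below are illustrative assumptions, not the authors' recommended configuration.

# Hedged sketch: loading the checkpoint named in the abstract with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct"  # repository named in the abstract
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # assumption: bf16 for memory efficiency
    trust_remote_code=True,       # assumption: custom model code hosted on the Hub
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain instruction tuning in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))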
Abstract: We propose ListT5, a novel reranking approach based on Fusion-in-Decoder (FiD) that handles multiple candidate passages at both training and inference time. We also introduce an efficient inference framework for listwise ranking based on m-ary tournament sort with output caching. We evaluate and compare our model on the BEIR benchmark for the zero-shot retrieval task, demonstrating that ListT5 (1) outperforms the state-of-the-art RankT5 baseline with a notable +1.3 gain in average NDCG@10, (2) has efficiency comparable to pointwise ranking models and surpasses that of previous listwise ranking models, and (3) overcomes the lost-in-the-middle problem of previous listwise rerankers. Our code, model checkpoints, and the evaluation framework are fully open-sourced at \url{https://github.com/soyoung97/ListT5}.
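A minimal sketch of m-ary tournament selection for listwise reranking, assuming a black-box rank_group(query, passages) function standing in for the FiD reranker that returns candidate indices sorted best-first; ListT5's output caching and top-k extraction are omitted here.

# Hedged sketch: select the best passage via an m-ary tournament.
from typing import Callable, List, Sequence

def tournament_top1(
    query: str,
    passages: Sequence[str],
    rank_group: Callable[[str, Sequence[str]], List[int]],
    m: int = 5,
) -> int:
    """Return the index (into `passages`) of the tournament winner."""
    candidates = list(range(len(passages)))
    while len(candidates) > 1:
        winners = []
        # Split surviving candidates into groups of at most m passages.
        for start in range(0, len(candidates), m):
            group = candidates[start:start + m]
            texts = [passages[i] for i in group]
            best_in_group = rank_group(query, texts)[0]  # local index of the group winner
            winners.append(group[best_in_group])
        candidates = winners
    return candidates[0]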
Abstract: Vulnerability to lexical perturbation is a critical weakness of automatic evaluation metrics for image captioning. This paper proposes Perturbation Robust Multi-Lingual CLIPScore (PR-MCS), which exhibits robustness to such perturbations, as a novel reference-free image captioning metric applicable to multiple languages. To achieve perturbation robustness, we fine-tune the text encoder of CLIP with our language-agnostic method to distinguish perturbed text from the original text. To verify the robustness of PR-MCS, we introduce a new fine-grained evaluation dataset consisting of detailed captions, critical objects, and the relationships between the objects for 3,000 images in five languages. In our experiments, PR-MCS significantly outperforms baseline metrics in capturing lexical noise across all perturbation types in all five languages, proving that PR-MCS is highly robust to lexical perturbations.
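A hedged sketch of the two ingredients the abstract describes: a reference-free CLIPScore between an image and a caption, and a margin-style objective that pushes original captions above lexically perturbed ones. The concrete loss and the multilingual backbone are assumptions; "openai/clip-vit-base-patch32" is only a stand-in for the model actually used.

# Hedged sketch: CLIPScore-style scoring plus an assumed perturbation-robustness objective.
import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")       # stand-in backbone
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(image, caption: str) -> torch.Tensor:
    """Reference-free score: cosine similarity of image and caption embeddings."""
    inputs = processor(text=[caption], images=image, return_tensors="pt", padding=True)
    img = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt = model.get_text_features(input_ids=inputs["input_ids"],
                                  attention_mask=inputs["attention_mask"])
    return F.cosine_similarity(img, txt).squeeze()

def robustness_loss(image, original: str, perturbed: str, margin: float = 0.1):
    """Assumed training signal: the original caption should outscore its
    perturbed version by at least `margin` (only the text encoder would be
    updated when fine-tuning)."""
    s_orig = clip_score(image, original)
    s_pert = clip_score(image, perturbed)
    return F.relu(margin - (s_orig - s_pert))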
Abstract: Instruction learning of Large Language Models (LLMs) has enabled zero-shot task generalization. However, instruction learning has been predominantly approached as a fine-tuning problem, including instruction tuning and reinforcement learning from human feedback, where LLMs are multi-task fine-tuned on various tasks with instructions. In this paper, we present a surprising finding that applying in-context learning to instruction learning, referred to as In-Context Instruction Learning (ICIL), significantly improves the zero-shot task generalization performance for both pretrained and instruction-fine-tuned models. One of the core advantages of ICIL is that it uses a single fixed prompt to evaluate all tasks, which is a concatenation of cross-task demonstrations. In particular, we demonstrate that the most powerful instruction-fine-tuned baseline (text-davinci-003) also benefits from ICIL by 9.3%, indicating that the effect of ICIL is complementary to instruction-based fine-tuning.
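A minimal sketch of the ICIL prompting scheme described above: one fixed prompt made of cross-task demonstrations is prepended to every evaluation instance. The demonstrations and formatting below are illustrative placeholders, not the ones used in the paper.

# Hedged sketch: one fixed cross-task demonstration block reused for every task.
FIXED_DEMONSTRATIONS = [
    {"instruction": "Classify the sentiment of the sentence as positive or negative.",
     "input": "The movie was a delight from start to finish.",
     "output": "positive"},
    {"instruction": "Translate the sentence into French.",
     "input": "Good morning.",
     "output": "Bonjour."},
]

def build_icil_prompt(task_instruction: str, task_input: str) -> str:
    demos = "\n\n".join(
        f"Instruction: {d['instruction']}\nInput: {d['input']}\nOutput: {d['output']}"
        for d in FIXED_DEMONSTRATIONS
    )
    # The same demonstration block is concatenated in front of every evaluation instance.
    return f"{demos}\n\nInstruction: {task_instruction}\nInput: {task_input}\nOutput:"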
Abstract: In this paper, we propose efficient transfer learning methods for training a personalized language model using a recurrent neural network with a long short-term memory architecture. With our proposed fast transfer learning schemes, a general language model is updated to a personalized language model with a small amount of user data and limited computing resources. These methods are especially useful in a mobile device environment where the data cannot be transferred out of the device for privacy reasons. Through experiments on dialogue data from a drama, we verify that our transfer learning methods successfully generate a personalized language model whose output is more similar to the personal language style in both qualitative and quantitative aspects.
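A hedged sketch of one possible fast-transfer scheme consistent with the abstract: start from a general LSTM language model and update only a small subset of parameters (here, the output projection) on the user's data. The actual schemes in the paper may differ; layer sizes and file names are placeholders.

# Hedged sketch: personalize a general LSTM language model by updating only the output layer.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 256, hidden_dim: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        hidden, _ = self.lstm(self.embed(token_ids))
        return self.out(hidden)

general_lm = LSTMLanguageModel(vocab_size=10_000)
# general_lm.load_state_dict(torch.load("general_lm.pt"))  # pretrained general weights

# Freeze the general model; only the output layer is adapted to the user.
for p in general_lm.parameters():
    p.requires_grad = False
for p in general_lm.out.parameters():
    p.requires_grad = True

optimizer = torch.optim.Adam(general_lm.out.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def personalize_step(token_ids: torch.Tensor) -> float:
    """One next-token-prediction update on a batch of the user's utterances."""
    logits = general_lm(token_ids[:, :-1])
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), token_ids[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()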