Seokhwan Jo

A.X K1 Technical Report

Jan 15, 2026

Sung Jun Cheon, Jaekyung Cho, Seongho Choi, Hyunjun Eun, Seokhwan Jo, Jaehyun Jun, Minsoo Kang, Jin Kim, Jiwon Kim, Minsang Kim(+50 more)

Abstract: We introduce A.X K1, a 519B-parameter Mixture-of-Experts (MoE) language model trained from scratch. Our design leverages scaling laws to optimize training configurations and vocabulary size under fixed computational budgets. A.X K1 is pre-trained on a corpus of approximately 10T tokens, curated by a multi-stage data processing pipeline. Designed to bridge the gap between reasoning capability and inference efficiency, A.X K1 supports explicitly controllable reasoning to facilitate scalable deployment across diverse real-world scenarios. We propose a simple yet effective Think-Fusion training recipe, enabling user-controlled switching between thinking and non-thinking modes within a single unified model. Extensive evaluations demonstrate that A.X K1 achieves performance competitive with leading open-source models, while establishing a distinctive advantage in Korean-language benchmarks.

* This paper is withdrawn pending additional internal review of the methodology and analysis

SUMBT+LaRL: End-to-end Neural Task-oriented Dialog System with Reinforcement Learning

Oct 06, 2020

Hwaran Lee, Seokhwan Jo, HyungJun Kim, Sangkeun Jung, Tae-Yoon Kim

Abstract: Neural approaches to developing each dialog component in task-oriented dialog systems have improved remarkably, yet optimizing overall system performance remains a challenge. In this paper, we propose an end-to-end trainable neural dialog system with reinforcement learning, named SUMBT+LaRL. SUMBT+ estimates user acts as well as dialog belief states, and LaRL models latent system action spaces and generates responses given the estimated contexts. We experimentally demonstrate that a training framework in which SUMBT+ and LaRL are pretrained separately and the entire system is then fine-tuned significantly increases dialog success rates. We also propose new success criteria for applying reinforcement learning to the end-to-end dialog system, and we analyze how the results differ depending on the success criteria and the evaluation method. Consequently, our model achieved a new state-of-the-art success rate of 85.4% on corpus-based evaluation, and a comparable success rate of 81.40% on the simulator-based evaluation provided by the DSTC8 challenge.

* 13 pages, 5 figures. This work has been submitted to the IEEE/ACM Transactions on Audio, Speech, and Language Processing for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
