Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yufei Tao

CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences

Nov 10, 2025

Rhitabrat Pokharel, Yufei Tao, Ameeta Agrawal

Figure 1 for CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences

Figure 2 for CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences

Figure 3 for CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences

Figure 4 for CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences

Abstract:Preference optimization is a critical post-training technique used to align large language models (LLMs) with human preferences, typically by fine-tuning on ranked response pairs. While methods like Direct Preference Optimization (DPO) have proven effective in English, they often fail to generalize robustly to multilingual settings. We propose a simple yet effective alternative, Confidence-Aware Preference Optimization (CAPO), which replaces DPO's fixed treatment of preference pairs with a dynamic loss scaling mechanism based on a relative reward. By modulating the learning signal according to the confidence in each preference pair, CAPO enhances robustness to noisy or low-margin comparisons, typically encountered in multilingual text. Empirically, CAPO outperforms existing preference optimization baselines by at least 16% in reward accuracy, and improves alignment by widening the gap between preferred and dispreferred responses across languages.

* Accepted at IJCNLP-AACL 2025 Findings

Via

Access Paper or Ask Questions

Monotone Classification with Relative Approximations

Jun 12, 2025

Yufei Tao

Figure 1 for Monotone Classification with Relative Approximations

Figure 2 for Monotone Classification with Relative Approximations

Figure 3 for Monotone Classification with Relative Approximations

Figure 4 for Monotone Classification with Relative Approximations

Abstract:In monotone classification, the input is a multi-set $P$ of points in $\mathbb{R}^d$, each associated with a hidden label from $\{-1, 1\}$. The goal is to identify a monotone function $h$, which acts as a classifier, mapping from $\mathbb{R}^d$ to $\{-1, 1\}$ with a small {\em error}, measured as the number of points $p \in P$ whose labels differ from the function values $h(p)$. The cost of an algorithm is defined as the number of points having their labels revealed. This article presents the first study on the lowest cost required to find a monotone classifier whose error is at most $(1 + \epsilon) \cdot k^*$ where $\epsilon \ge 0$ and $k^*$ is the minimum error achieved by an optimal monotone classifier -- in other words, the error is allowed to exceed the optimal by at most a relative factor. Nearly matching upper and lower bounds are presented for the full range of $\epsilon$. All previous work on the problem can only achieve an error higher than the optimal by an absolute factor.

Via

Access Paper or Ask Questions

When Context Leads but Parametric Memory Follows in Large Language Models

Sep 13, 2024

Yufei Tao, Adam Hiatt, Erik Haake, Antonie J. Jetter, Ameeta Agrawal

Figure 1 for When Context Leads but Parametric Memory Follows in Large Language Models

Figure 2 for When Context Leads but Parametric Memory Follows in Large Language Models

Figure 3 for When Context Leads but Parametric Memory Follows in Large Language Models

Figure 4 for When Context Leads but Parametric Memory Follows in Large Language Models

Abstract:Large language models (LLMs) have demonstrated remarkable progress in leveraging diverse knowledge sources. This study investigates how nine widely used LLMs allocate knowledge between local context and global parameters when answering open-ended questions in knowledge-consistent scenarios. We introduce a novel dataset, WikiAtomic, and systematically vary context sizes to analyze how LLMs prioritize and utilize the provided information and their parametric knowledge in knowledge-consistent scenarios. Additionally, we also study their tendency to hallucinate under varying context sizes. Our findings reveal consistent patterns across models, including a consistent reliance on both contextual (around 70%) and parametric (around 30%) knowledge, and a decrease in hallucinations with increasing context. These insights highlight the importance of more effective context organization and developing models that use input more deterministically for robust performance.

Via

Access Paper or Ask Questions

ChatGPT Role-play Dataset: Analysis of User Motives and Model Naturalness

Mar 26, 2024

Yufei Tao, Ameeta Agrawal, Judit Dombi, Tetyana Sydorenko, Jung In Lee

Abstract:Recent advances in interactive large language models like ChatGPT have revolutionized various domains; however, their behavior in natural and role-play conversation settings remains underexplored. In our study, we address this gap by deeply investigating how ChatGPT behaves during conversations in different settings by analyzing its interactions in both a normal way and a role-play setting. We introduce a novel dataset of broad range of human-AI conversations annotated with user motives and model naturalness to examine (i) how humans engage with the conversational AI model, and (ii) how natural are AI model responses. Our study highlights the diversity of user motives when interacting with ChatGPT and variable AI naturalness, showing not only the nuanced dynamics of natural conversations between humans and AI, but also providing new avenues for improving the effectiveness of human-AI communication.

* Accepted by LREC-COLING 2024

Via

Access Paper or Ask Questions

Making a Long Story Short in Conversation Modeling

Jan 31, 2024

Yufei Tao, Tiernan Mines, Ameeta Agrawal

Figure 1 for Making a Long Story Short in Conversation Modeling

Figure 2 for Making a Long Story Short in Conversation Modeling

Figure 3 for Making a Long Story Short in Conversation Modeling

Figure 4 for Making a Long Story Short in Conversation Modeling

Abstract:Conversation systems accommodate diverse users with unique personalities and distinct writing styles. Within the domain of multi-turn dialogue modeling, this work studies the impact of varied utterance lengths on the quality of subsequent responses generated by conversation models. Using GPT-3 as the base model, multiple dialogue datasets, and several metrics, we conduct a thorough exploration of this aspect of conversational models. Our analysis sheds light on the complex relationship between utterance lengths and the quality of follow-up responses generated by dialogue systems. Empirical findings suggests that, for certain types of conversations, utterance lengths can be reduced by up to 72% without any noticeable difference in the quality of follow-up responses.

* This paper was accepted by TEICAI workshop at EACL 2024

Via

Access Paper or Ask Questions