Title: One Goal, Many Challenges: Robust Preference Optimization Amid Content-Aware and Multi-Source Noise

Mar 16, 2025

Amirabbas Afzali, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, Sanjay Lall

Figures 1-4 for One Goal, Many Challenges: Robust Preference Optimization Amid Content-Aware and Multi-Source Noise (images not included)

Abstract: Large Language Models (LLMs) have made significant strides in generating human-like responses, largely due to preference alignment techniques. However, these methods often assume unbiased human feedback, an assumption that rarely holds in real-world scenarios. This paper introduces Content-Aware Noise-Resilient Preference Optimization (CNRPO), a novel framework that addresses multiple sources of content-dependent noise in preference learning. CNRPO employs a multi-objective optimization approach to separate true preferences from content-aware noise, mitigating its impact. We leverage backdoor attack mechanisms to efficiently learn and control various noise sources within a single model. Theoretical analysis and extensive experiments on different synthetic noisy datasets demonstrate that CNRPO significantly improves alignment with primary human preferences while controlling for secondary noise sources and biases, such as response length and harmfulness.
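To make "content-dependent noise" concrete, here is a toy sketch of how a length-biased synthetic preference dataset could be built. It is not taken from the paper: `PreferencePair`, `inject_length_bias`, `fraction_preferring_longer`, and the `noise_rate` parameter are hypothetical names, and the noise model (relabel a pair so that the longer response wins) is an assumption chosen only to illustrate a bias that depends on response content rather than on random label flips.

```python
import random
from dataclasses import dataclass


@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # response labeled as preferred
    rejected: str  # response labeled as dispreferred


def inject_length_bias(pairs, noise_rate, seed=0):
    """Return a new list of pairs with length-biased label noise (assumed noise model).

    With probability `noise_rate`, a pair is relabeled so that the longer
    response is marked as chosen, regardless of the original label.
    Pairs with equal-length responses keep their original label.
    """
    rng = random.Random(seed)
    noisy = []
    for pair in pairs:
        if rng.random() < noise_rate:
            # Stable sort: ties keep the original (chosen, rejected) order.
            longer, shorter = sorted(
                (pair.chosen, pair.rejected), key=len, reverse=True
            )
            noisy.append(PreferencePair(pair.prompt, longer, shorter))
        else:
            noisy.append(pair)
    return noisy


def fraction_preferring_longer(pairs):
    """Fraction of pairs whose chosen response is strictly longer than the rejected one."""
    return sum(len(p.chosen) > len(p.rejected) for p in pairs) / len(pairs)
```

Comparing `fraction_preferring_longer` before and after injection gives a rough measure of how strongly the secondary bias is present in the labels. A noise-robust method would aim to recover the original preferences despite that shift.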
