Junkang Wu

$α$-DPO: Adaptive Reward Margin is What Direct Preference Optimization Needs

Oct 14, 2024

$β$-DPO: Direct Preference Optimization with Dynamic $β$

Jul 11, 2024

Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization

Jul 10, 2024

Direct Multi-Turn Preference Optimization for Language Agents

Jun 25, 2024

Lower-Left Partial AUC: An Effective and Efficient Optimization Metric for Recommendation

Feb 29, 2024

BSL: Understanding and Improving Softmax Loss for Recommendation

Dec 20, 2023

Understanding Contrastive Learning via Distributionally Robust Optimization

Oct 17, 2023

On the Theories Behind Hard Negative Sampling for Recommendation

Feb 19, 2023

Adap-$τ$: Adaptively Modulating Embedding Magnitude for Recommendation

Feb 09, 2023

FFHR: Fully and Flexible Hyperbolic Representation for Knowledge Graph Completion

Feb 07, 2023