Zhengyi Yang

$α$-DPO: Adaptive Reward Margin is What Direct Preference Optimization Needs

Oct 14, 2024

$β$-DPO: Direct Preference Optimization with Dynamic $β$

Jul 11, 2024

Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization

Jul 10, 2024

On Softmax Direct Preference Optimization for Recommendation

Jun 14, 2024

Item-side Fairness of Large Language Model-based Recommendation System

Feb 23, 2024

MolTC: Towards Molecular Relational Modeling In Language Models

Feb 14, 2024

LLaRA: Aligning Large Language Models with Sequential Recommenders

Dec 05, 2023

Large Language Model Can Interpret Latent Space of Sequential Recommender

Oct 31, 2023

Generate What You Prefer: Reshaping Sequential Recommendation via Guided Diffusion

Oct 31, 2023

Model-enhanced Contrastive Reinforcement Learning for Sequential Recommendation

Oct 25, 2023