Yifu Huo

LRHP: Learning Representations for Human Preferences via Preference Pairs

Oct 06, 2024
Via arXiv

RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data

Aug 22, 2024
Via arXiv

ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation

Aug 04, 2023
Via arXiv