Ruizhe Shi

The Crucial Role of Samplers in Online Direct Preference Optimization

Sep 29, 2024

Decoding-Time Language Model Alignment with Multiple Objectives

Jun 27, 2024

Rethinking Transformers in Solving POMDPs

May 30, 2024

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Nov 07, 2023

H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation

Oct 13, 2023

A Novel Gradient Descent Least Squares (GDLS) Algorithm for SMV Gridless Line Spectrum Estimation with Efficiency

Mar 16, 2022