Daiki E. Matsunaga

GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets

Oct 19, 2024
Via arXiv

Stitching Sub-Trajectories with Conditional Diffusion Model for Goal-Conditioned Offline RL

Feb 11, 2024
Via arXiv

AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation

Nov 03, 2023
Via arXiv