Hanyang Zhao

Fine-Tuning Diffusion Generative Models via Rich Preference Optimization

Mar 13, 2025

Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning

Feb 03, 2025

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

Oct 16, 2024

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

Oct 05, 2024

Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey

Sep 17, 2024

Scores as Actions: a framework of fine-tuning diffusion models by continuous-time reinforcement learning

Sep 12, 2024

Mallows-DPO: Fine-Tune Your LLM with Preference Dispersions

May 23, 2024

Score-based Diffusion Models via Stochastic Differential Equations -- a Technical Tutorial

Feb 12, 2024

Contractive Diffusion Probabilistic Models

Jan 23, 2024

Policy Optimization for Continuous Reinforcement Learning

Jun 02, 2023