Liyu Chen

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs

Oct 10, 2024

BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data

Oct 01, 2024

Effective Diffusion Transformer Architecture for Image Super-Resolution

Sep 29, 2024

Collaboration of Teachers for Semi-supervised Object Detection

May 22, 2024

$\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model

Mar 11, 2024

$\mathcal{B}$-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis

Oct 04, 2023

Layered State Discovery for Incremental Autonomous Exploration

Feb 07, 2023

Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path

Oct 10, 2022

Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback

May 26, 2022

Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments

May 25, 2022