Wenzhe Li

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations

Feb 10, 2025

Towards Principled Superhuman AI for Multiplayer Symmetric Games

Jun 06, 2024

FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning

Jun 04, 2024

A Survey on Transformers in Reinforcement Learning

Jan 08, 2023

Flow to Control: Offline Reinforcement Learning with Lossless Primitive Discovery

Dec 02, 2022

Improving Graph-Based Text Representations with Character and Word Level N-grams

Oct 12, 2022

Latent-Variable Advantage-Weighted Policy Optimization for Offline RL

Mar 16, 2022

Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL

Feb 14, 2022

Estimating High Order Gradients of the Data Distribution by Denoising

Nov 08, 2021

Offline Reinforcement Learning with Reverse Model-based Imagination

Oct 01, 2021