Zheng Wu

Flaming-hot Initiation with Regular Execution Sampling for Large Language Models

Oct 28, 2024
Via arXiv

Process Supervision-Guided Policy Optimization for Code Generation

Oct 23, 2024
Via arXiv

Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization

Oct 11, 2024
Via arXiv

Efficient Reinforcement Learning of Task Planners for Robotic Palletization through Iterative Action Masking Learning

Apr 07, 2024
Via arXiv

DBPF: A Framework for Efficient and Robust Dynamic Bin-Picking

Mar 25, 2024
Via arXiv

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023
Via arXiv

Efficient Sim-to-real Transfer of Contact-Rich Manipulation Skills with Online Admittance Residual Learning

Oct 16, 2023
Via arXiv

Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward

Dec 03, 2022
Via arXiv

Prim-LAfD: A Framework to Learn and Adapt Primitive-Based Skills from Demonstrations for Insertion Tasks

Dec 02, 2022
Via arXiv

Zero-Shot Policy Transfer with Disentangled Task Representation of Meta-Reinforcement Learning

Oct 01, 2022
Via arXiv