Chuheng Zhang

Policy Filtration in RLHF to Fine-Tune LLM for Code Generation

Sep 11, 2024

Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL

Jul 20, 2024

Empowering Large Language Models on Robotic Manipulation with Affordance Prompting

Apr 17, 2024

ARO: Large Language Model Supervised Robotics Text2Skill Autonomous Learning

Mar 23, 2024

Pre-Trained Large Language Models for Industrial Control

Aug 06, 2023

A Versatile Multi-Agent Reinforcement Learning Benchmark for Inventory Management

Jun 13, 2023

RePreM: Representation Pre-training with Masked Model for Reinforcement Learning

Mar 03, 2023

Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management

Dec 18, 2022

A Transformer-Based User Satisfaction Prediction for Proactive Interaction Mechanism in DuerOS

Dec 05, 2022

TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets

Dec 05, 2022