Jing-Cheng Pang

Reinforcement Learning with Promising Tokens for Large Language Models

Feb 03, 2026
Via arXiv

EDCO: Dynamic Curriculum Orchestration for Domain-specific Large Language Model Fine-tuning

Jan 07, 2026
Via arXiv

ReLAM: Learning Anticipation Model for Rewarding Visual Robotic Manipulation

Sep 26, 2025
Via arXiv

ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts

May 15, 2025
Via arXiv

Beyond Simple Sum of Delayed Rewards: Non-Markovian Reward Modeling for Reinforcement Learning

Oct 26, 2024
Via arXiv

Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts

Apr 14, 2024
Via arXiv

Empowering Language Models with Active Inquiry for Deeper Understanding

Feb 06, 2024
Via arXiv

Language Model Self-improvement by Reinforcement Learning Contemplation

May 23, 2023
Via arXiv

Natural Language-conditioned Reinforcement Learning with Inside-out Task Language Development and Translation

Feb 18, 2023
Via arXiv

Regret Minimization Experience Replay

Jun 06, 2021
Via arXiv