Cheems Wang

Limited Reasoning Space: The cage of long-horizon reasoning in LLMs

Feb 22, 2026
Via arXiv

Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search

Sep 19, 2025
Via arXiv

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Dec 15, 2024
Via arXiv

Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration

Oct 03, 2024
Via arXiv

Robust Fast Adaptation from Adversarially Explicit Task Distribution Generation

Jul 28, 2024
Via arXiv

Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation

Jun 24, 2024
Via arXiv

GO4Align: Group Optimization for Multi-Task Alignment

Apr 09, 2024
Via arXiv