Shenzhi Wang

Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

Feb 04, 2026
Via arXiv

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Jan 21, 2026
Via arXiv

OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use

Aug 06, 2025
Via arXiv

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

May 07, 2025
Via arXiv

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Apr 07, 2025
Via arXiv

DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution

Nov 04, 2024
Via arXiv

LLM-based Optimization of Compound AI Systems: A Survey

Oct 21, 2024
Via arXiv

Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing

Jul 11, 2024
Via arXiv

DiveR-CT: Diversity-enhanced Red Teaming with Relaxing Constraints

May 29, 2024
Via arXiv

LLM Agents for Psychology: A Study on Gamified Assessments

Feb 19, 2024
Via arXiv