Mao Zheng

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Apr 02, 2026

PRISM: Probability Reallocation with In-Span Masking for Knowledge-Sensitive Alignment

Apr 02, 2026

A Survey of On-Policy Distillation for Large Language Models

Apr 01, 2026

Beyond the Illusion of Consensus: From Surface Heuristics to Knowledge-Grounded Evaluation in LLM-as-a-Judge

Mar 11, 2026

Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions

Mar 10, 2026

CodeDelegator: Mitigating Context Pollution via Role Separation in Code-as-Action Agents

Jan 21, 2026

PodBench: A Comprehensive Benchmark for Instruction-Aware Audio-Oriented Podcast Script Generation

Jan 21, 2026

HY-MT1.5 Technical Report

Dec 30, 2025

Hunyuan-MT Technical Report

Sep 05, 2025

Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning

May 27, 2025