Zhiqiang Zhang

GNNVerifier: Graph-based Verifier for LLM Task Planning

Mar 17, 2026
Via arXiv

Optimal Expert-Attention Allocation in Mixture-of-Experts: A Scalable Law for Dynamic Model Design

Mar 11, 2026
Via arXiv

InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning

Feb 09, 2026
Via arXiv

When Sharpening Becomes Collapse: Sampling Bias and Semantic Coupling in RL with Verifiable Rewards

Jan 26, 2026
Via arXiv

Token-level Collaborative Alignment for LLM-based Generative Recommendation

Jan 26, 2026
Via arXiv

MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging

Jan 25, 2026
Via arXiv

Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards

Dec 25, 2025
Via arXiv

AesTest: Measuring Aesthetic Intelligence from Perception to Production

Nov 09, 2025
Via arXiv

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Oct 21, 2025
Via arXiv

Arrows of Math Reasoning Data Synthesis for Large Language Models: Diversity, Complexity and Correctness

Aug 26, 2025
Via arXiv