Xiao Ding

Precision over Diversity: High-Precision Reward Generalizes to Robust Instruction Following

Jan 08, 2026

Exactly or Approximately Wasserstein Distributionally Robust Estimation According to Wasserstein Radii Being Small or Large

Oct 02, 2025

Com$^2$: A Causal-Guided Benchmark for Exploring Complex Commonsense Reasoning in Large Language Models

Jun 08, 2025

CrossICL: Cross-Task In-Context Learning via Unsupervised Demonstration Transfer

May 30, 2025

ExpeTrans: LLMs Are Experiential Transfer Learners

May 29, 2025

Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning

May 27, 2025

Benchmarking and Pushing the Multi-Bias Elimination Boundary of LLMs via Causal Effect Estimation-guided Debiasing

May 22, 2025

UFO-RL: Uncertainty-Focused Optimization for Efficient Reinforcement Learning Data Selection

May 18, 2025

Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models

Apr 17, 2025

Boosting Tool Use of Large Language Models via Iterative Reinforced Fine-Tuning

Jan 15, 2025