Zhengling Qi

ARISE: Agent Reasoning with Intrinsic Skill Evolution in Hierarchical Reinforcement Learning

Mar 17, 2026
Via arXiv

When Right Meets Wrong: Bilateral Context Conditioning with Reward-Confidence Correction for GRPO

Mar 13, 2026
Via arXiv

Double Fairness Policy Learning: Integrating Action Fairness and Outcome Fairness in Decision-making

Jan 27, 2026
Via arXiv

Beyond Demand Estimation: Consumer Surplus Evaluation via Cumulative Propensity Weights

Jan 03, 2026
Via arXiv

InSPO: Unlocking Intrinsic Self-Reflection for LLM Preference Optimization

Dec 30, 2025
Via arXiv

PASTA: A Unified Framework for Offline Assortment Learning

Oct 02, 2025
Via arXiv

Quantile-Optimal Policy Learning under Unmeasured Confounding

Jun 08, 2025
Via arXiv

Reinforcement Learning with Continuous Actions Under Unmeasured Confounding

May 01, 2025
Via arXiv

Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

Apr 14, 2025
Via arXiv

Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning

Dec 08, 2024
Via arXiv