Jingqing Ruan

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Mar 17, 2026

Harmonizing Dense and Sparse Signals in Multi-turn RL: Dual-Horizon Credit Assignment for Industrial Sales Agents

Mar 02, 2026

Reasoner for Real-World Event Detection: Scaling Reinforcement Learning via Adaptive Perplexity-Aware Sampling Strategy

Jul 02, 2025

AMoPO: Adaptive Multi-objective Preference Optimization without Reward Models and Reference Models

Jun 08, 2025

When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning

May 21, 2025

QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning

Aug 20, 2024

GuideLight: "Industrial Solution" Guidance for More Practical Traffic Signal Control Agents

Jul 15, 2024

CoSLight: Co-optimizing Collaborator Selection and Decision-making to Enhance Traffic Signal Control

May 27, 2024

Hummer: Towards Limited Competitive Preference Dataset

May 21, 2024

Learning Causal Dynamics Models in Object-Oriented Environments

May 21, 2024