Shixuan Liu

Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

Feb 04, 2026

Detecting Unobserved Confounders: A Kernelized Regression Approach

Jan 01, 2026

Learning complete and explainable visual representations from itemized text supervision

Dec 11, 2025

CoCo-MILP: Inter-Variable Contrastive and Intra-Constraint Competitive MILP Solution Prediction

Nov 12, 2025

GRACE: Generative Representation Learning via Contrastive Policy Optimization

Oct 06, 2025

Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security

Jul 29, 2025

Group Sequence Policy Optimization

Jul 24, 2025

Stable Reinforcement Learning for Efficient Reasoning

May 23, 2025

Qwen3 Technical Report

May 14, 2025

From Captions to Rewards (CAREVL): Leveraging Large Language Model Experts for Enhanced Reward Modeling in Large Vision-Language Models

Mar 08, 2025