Mingxiao Li

RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents

Jul 30, 2025
Via arXiv

Step-Audio 2 Technical Report

Jul 24, 2025
Via arXiv

Mitigating Negative Interference in Multilingual Sequential Knowledge Editing through Null-Space Constraints

Jun 12, 2025
Via arXiv

Consistent Story Generation with Asymmetry Zigzag Sampling

Jun 12, 2025
Via arXiv

VISTA: Enhancing Vision-Text Alignment in MLLMs via Cross-Modal Mutual Information Maximization

May 19, 2025
Via arXiv

Towards More Accurate Personalized Image Generation: Addressing Overfitting and Evaluation Bias

Mar 09, 2025
Via arXiv

On a Connection Between Imitation Learning and RLHF

Mar 07, 2025
Via arXiv

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025
Via arXiv

From Visuals to Vocabulary: Establishing Equivalence Between Image and Text Token Through Autoregressive Pre-training in MLLMs

Feb 13, 2025
Via arXiv

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

Feb 04, 2025
Via arXiv