Tianyi Lin

R1-SyntheticVL: Is Synthetic Data from Generative Models Ready for Multimodal Large Language Model?

Feb 03, 2026

Reward-free Alignment for Conflicting Objectives

Feb 02, 2026

Exploration vs Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward

Dec 21, 2025

Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO

May 16, 2025

ComPO: Preference Alignment via Comparison Oracles

May 08, 2025

Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization

Aug 21, 2024

Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback

Oct 24, 2023

A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport

Oct 21, 2023

Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds

Jun 29, 2023

Deterministic Nonsmooth Nonconvex Optimization

Feb 16, 2023