
Ruida Zhou

DISPO: Enhancing Training Efficiency and Stability in Reinforcement Learning for Large Language Model Mathematical Reasoning

Feb 01, 2026

Direct Preference Optimization with Rating Information: Practical Algorithms and Provable Gains

Jan 31, 2026

SPIRE: Conditional Personalization for Federated Diffusion Generative Models

Jun 14, 2025

Cost-Aware Optimal Pairwise Pure Exploration

Mar 10, 2025

Path-Guided Particle-based Sampling

Dec 04, 2024

On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery

Oct 17, 2024

On the Training Convergence of Transformers for In-Context Classification

Oct 15, 2024

Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning

Oct 15, 2024

Transformers learn variable-order Markov chains in-context

Oct 07, 2024

Reframing Data Value for Large Language Models Through the Lens of Plausibility

Aug 30, 2024