Ruiqi Zhang

Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes

Apr 05, 2025
Via arXiv

Mitigating Ambiguities in 3D Classification with Gaussian Splatting

Mar 11, 2025
Via arXiv

How Do LLMs Perform Two-Hop Reasoning in Context?

Feb 19, 2025
Via arXiv

Fast Best-of-N Decoding via Speculative Rejection

Oct 26, 2024
Via arXiv

Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning

Oct 09, 2024
Via arXiv

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning

Apr 08, 2024
Via arXiv

Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement

Feb 24, 2024
Via arXiv

In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization

Feb 22, 2024
Via arXiv

AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition

Feb 18, 2024
Via arXiv

Spreeze: High-Throughput Parallel Reinforcement Learning Framework

Dec 11, 2023
Via arXiv