Ruiqi Zhang

Fast Best-of-N Decoding via Speculative Rejection

Oct 26, 2024

Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning

Oct 09, 2024

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning

Apr 08, 2024

Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement

Feb 24, 2024

In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization

Feb 22, 2024

AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition

Feb 18, 2024

Spreeze: High-Throughput Parallel Reinforcement Learning Framework

Dec 11, 2023

Explicifying Neural Implicit Fields for Efficient Dynamic Human Avatar Modeling via a Neural Explicit Surface

Aug 07, 2023

Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data

Jul 10, 2023

Trained Transformers Learn Linear Models In-Context

Jun 16, 2023