Sergey Levine

University of California, Berkeley

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Mar 19, 2025

Dynamic Search for Inference-Time Alignment in Diffusion Models

Mar 03, 2025

Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models

Feb 26, 2025

Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation

Feb 23, 2025

Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design

Feb 20, 2025

Scaling Test-Time Compute Without Verification or RL is Suboptimal

Feb 18, 2025

Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following

Feb 08, 2025

Value-Based Deep RL Scales Predictably

Feb 06, 2025

Flow Q-Learning

Feb 04, 2025

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Jan 28, 2025