Sergey Levine

UC Berkeley

Scaling Test-Time Compute Without Verification or RL is Suboptimal

Feb 18, 2025

Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following

Feb 08, 2025

Value-Based Deep RL Scales Predictably

Feb 06, 2025

Flow Q-Learning

Feb 04, 2025

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Jan 28, 2025

FAST: Efficient Action Tokenization for Vision-Language-Action Models

Jan 16, 2025

Reward-Guided Controlled Generation for Inference-Time Alignment in Diffusion Models: Tutorial and Review

Jan 16, 2025

Beyond Sight: Finetuning Generalist Robot Policies with Heterogeneous Sensors via Language Grounding

Jan 08, 2025

Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

Dec 17, 2024

RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning

Dec 13, 2024