Carl Doersch

Derek

Scaling 4D Representations

Dec 19, 2024

Motion Prompting: Controlling Video Generation with Motion Trajectories

Dec 03, 2024

Moving Off-the-Grid: Scene-Grounded Video Representations

Nov 08, 2024

Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation

Sep 24, 2024

TAPVid-3D: A Benchmark for Tracking Any Point in 3D

Jul 08, 2024

BootsTAP: Bootstrapped Training for Tracking-Any-Point

Feb 01, 2024

Learning from One Continuous Video Stream

Dec 01, 2023

RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation

Aug 31, 2023

TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement

Jun 14, 2023

Perception Test: A Diagnostic Benchmark for Multimodal Video Models

May 23, 2023