Shutong Jin

PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement

Oct 29, 2024
Via arXiv

Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies

Sep 30, 2024
Via arXiv

RealCraft: Attention Control as A Solution for Zero-shot Long Video Editing

Dec 21, 2023
Via arXiv

Video Transformers under Occlusion: How Physics and Background Attributes Impact Large Models for Robotic Manipulation

Oct 11, 2023
Via arXiv