Rujie Wu

LongViTU: Instruction Tuning for Long-Form Video Understanding

Jan 09, 2025
Via arXiv

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

Dec 31, 2024
Via arXiv

UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Jul 07, 2024
Via arXiv

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding

Mar 18, 2024
Via arXiv

Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World

Oct 16, 2023
Via arXiv

Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection

Jul 22, 2022
Via arXiv