Shoubin Yu

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

Dec 11, 2024

Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level

Nov 15, 2024

SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation

Oct 16, 2024

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos

May 29, 2024

RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives

May 28, 2024

STAR: A Benchmark for Situated Reasoning in Real-World Videos

May 15, 2024

CREMA: Multimodal Compositional Video Reasoning via Efficient Modular Adaptation and Fusion

Feb 08, 2024

A Simple LLM Framework for Long-Range Video Question-Answering

Dec 28, 2023

Self-Chained Image-Language Model for Video Localization and Question Answering

May 11, 2023

Regularity Learning via Explicit Distribution Modeling for Skeletal Video Anomaly Detection

Dec 08, 2021