Yuxuan Cai

Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Dec 02, 2024
Via arXiv

Fleximo: Towards Flexible Text-to-Human Motion Video Generation

Nov 29, 2024
Via arXiv

Improving Multi-Subject Consistency in Open-Domain Image Generation with Isolation and Reposition Attention

Nov 28, 2024
Via arXiv

MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Nov 24, 2024
Via arXiv

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Oct 21, 2024
Via arXiv

Allegro: Open the Black Box of Commercial-Level Video Generation Model

Oct 20, 2024
Via arXiv

Attention-Guided Perturbation for Unsupervised Image Anomaly Detection

Aug 14, 2024
Via arXiv

ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

Jun 06, 2024
Via arXiv

High-Performance Temporal Reversible Spiking Neural Networks with $O(L)$ Training Memory and $O(1)$ Inference Cost

May 26, 2024
Via arXiv

Anomaly Detection by Adapting a pre-trained Vision Language Model

Mar 14, 2024
Via arXiv