Enhua Wu

Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs

Oct 14, 2024
Via arXiv

Denoising with a Joint-Embedding Predictive Architecture

Oct 02, 2024
Via arXiv

Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input

Aug 28, 2024
Via arXiv

Deformable 3D Shape Diffusion Model

Jul 31, 2024
Via arXiv

Fine-gained Zero-shot Video Sampling

Jul 31, 2024
Via arXiv

Space-time Reinforcement Network for Video Object Segmentation

May 07, 2024
Via arXiv

ParameterNet: Parameters Are All You Need for Large-scale Visual Pretraining of Mobile Networks

Jun 26, 2023
Via arXiv

Robust and Efficient Memory Network for Video Object Segmentation

Apr 24, 2023
Via arXiv

Bag of Tricks with Quantized Convolutional Neural Networks for image classification

Mar 13, 2023
Via arXiv

3D Human Pose Lifting with Grid Convolution

Feb 17, 2023
Via arXiv