Nong Sang

Improving Multi-modal Large Language Model through Boosting Vision Capabilities

Oct 17, 2024

DFIMat: Decoupled Flexible Interactive Matting in Multi-Person Scenarios

Oct 13, 2024

Replace Anyone in Videos

Sep 30, 2024

Cross-video Identity Correlating for Person Re-identification Pre-training

Sep 27, 2024

Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM

Jun 18, 2024

UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation

Jun 03, 2024

Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos

Apr 26, 2024

REPAIR: Rank Correlation and Noisy Pair Half-replacing with Memory for Noisy Correspondence

Mar 13, 2024

GlanceVAD: Exploring Glance Supervision for Label-efficient Video Anomaly Detection

Mar 12, 2024

Spatial Cascaded Clustering and Weighted Memory for Unsupervised Person Re-identification

Mar 01, 2024