Armin Mustafa

Joint Reconstruction of Spatially-Coherent and Realistic Clothed Humans and Objects from a Single Image

Feb 25, 2025
Via arXiv

Deconstruct Complexity (DeComplex): A Novel Perspective on Tackling Dense Action Detection

Jan 30, 2025
Via arXiv

Efficient Audio-Visual Fusion for Video Classification

Nov 08, 2024
Via arXiv

Boosting Camera Motion Control for Video Diffusion Transformers

Oct 14, 2024
Via arXiv

RenDetNet: Weakly-supervised Shadow Detection with Shadow Caster Verification

Aug 30, 2024
Via arXiv

Attend-Fusion: Efficient Audio-Visual Fusion for Video Classification

Aug 26, 2024
Via arXiv

Single-image coherent reconstruction of objects and humans

Aug 15, 2024
Via arXiv

An Effective-Efficient Approach for Dense Multi-Label Action Detection

Jun 10, 2024
Via arXiv

NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative

Jun 10, 2024
Via arXiv

CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing

May 17, 2024
Via arXiv