Zhen Lei

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Apr 01, 2025
Via arXiv

Mixture-of-Attack-Experts with Class Regularization for Unified Physical-Digital Face Attack Detection

Apr 01, 2025
Via arXiv

FA^{3}-CLIP: Frequency-Aware Cues Fusion and Attack-Agnostic Prompt Learning for Unified Face Attack Detection

Apr 01, 2025
Via arXiv

Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data

Mar 27, 2025
Via arXiv

PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation

Mar 20, 2025
Via arXiv

Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport

Mar 19, 2025
Via arXiv

Bayesian Test-Time Adaptation for Vision-Language Models

Mar 12, 2025
Via arXiv

SRM-Hair: Single Image Head Mesh Reconstruction via 3D Morphable Hair

Mar 08, 2025
Via arXiv

EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery

Jan 20, 2025
Via arXiv

WMamba: Wavelet-based Mamba for Face Forgery Detection

Jan 16, 2025
Via arXiv