Sidan Du

FaceSnap: Enhanced ID-fidelity Network for Tuning-free Portrait Customization

Jan 31, 2026

Diff-PC: Identity-preserving and 3D-aware Controllable Diffusion for Zero-shot Portrait Customization

Jan 31, 2026

HiFi-Portrait: Zero-shot Identity-preserved Portrait Generation with High-fidelity Multi-face Fusion

Dec 16, 2025

Uni-Inter: Unifying 3D Human Motion Synthesis Across Diverse Interaction Contexts

Nov 17, 2025

Free3D: 3D Human Motion Emerges from Single-View 2D Supervision

Nov 14, 2025

Multi-modal Fusion and Query Refinement Network for Video Moment Retrieval and Highlight Detection

Jan 18, 2025

Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models

Jan 14, 2025

Real-time Multi-view Omnidirectional Depth Estimation System for Robots and Autonomous Driving on Real Scenes

Sep 12, 2024

GPTSee: Enhancing Moment Retrieval and Highlight Detection via Description-Based Similarity Features

Mar 10, 2024

VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT

Mar 04, 2024