Yifang Xu

FaceSnap: Enhanced ID-fidelity Network for Tuning-free Portrait Customization

Jan 31, 2026, via arXiv

Diff-PC: Identity-preserving and 3D-aware Controllable Diffusion for Zero-shot Portrait Customization

Jan 31, 2026, via arXiv

HiFi-Portrait: Zero-shot Identity-preserved Portrait Generation with High-fidelity Multi-face Fusion

Dec 16, 2025, via arXiv

ViDA-UGC: Detailed Image Quality Analysis via Visual Distortion Assessment for UGC Images

Aug 18, 2025, via arXiv

Multi-modal Fusion and Query Refinement Network for Video Moment Retrieval and Highlight Detection

Jan 18, 2025, via arXiv

Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models

Jan 14, 2025, via arXiv

NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

Apr 17, 2024, via arXiv

GPTSee: Enhancing Moment Retrieval and Highlight Detection via Description-Based Similarity Features

Mar 10, 2024, via arXiv

VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT

Mar 04, 2024, via arXiv

Pyramid Feature Attention Network for Monocular Depth Prediction

Mar 03, 2024, via arXiv