Pengfei Wan

GGTalker: Talking Head Synthesis with Generalizable Gaussian Priors and Identity-Specific Adaptation

Jun 26, 2025
Via arXiv

SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution

Jun 24, 2025
Via arXiv

FilMaster: Bridging Cinematic Principles and Generative AI for Automated Film Generation

Jun 23, 2025
Via arXiv

VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks

Jun 10, 2025
Via arXiv

FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers

Jun 05, 2025
Via arXiv

UNIC: Unified In-Context Video Editing

Jun 04, 2025
Via arXiv

RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction

May 28, 2025
Via arXiv

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios

May 27, 2025
Via arXiv

OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers

May 27, 2025
Via arXiv

Scaling Image and Video Generation via Test-Time Evolutionary Search

May 23, 2025
Via arXiv