Guangcong Wang

Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency

Mar 26, 2025

DiffV2IR: Visible-to-Infrared Diffusion Model via Vision-Language Understanding

Mar 24, 2025

HLV-1K: A Large-scale Hour-Long Video Benchmark for Time-Specific Long Video Understanding

Jan 03, 2025

GuardSplat: Efficient and Robust Watermarking for 3D Gaussian Splatting

Dec 02, 2024

From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding

Sep 27, 2024

Text-based Talking Video Editing with Cascaded Conditional Diffusion

Jul 20, 2024

WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation

Jul 02, 2024

Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo

May 20, 2024

PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation

Dec 07, 2023

PERF: Panoramic Neural Radiance Field from a Single Panorama

Oct 28, 2023