Xingrui Wang

University of Southern California

PulseCheck457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models

Feb 13, 2025

PulseCheck457: A Diagnostic Benchmark for Comprehensive Spatial Reasoning of Large Multimodal Models

Feb 12, 2025

GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs

Dec 22, 2024

TIV-Diffusion: Towards Object-Centric Movement for Text-driven Image to Video Generation

Dec 13, 2024

MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration

Jul 15, 2024

CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion

Jun 07, 2024

Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering

Jun 02, 2024

3D-Aware Visual Question Answering about Parts, Poses and Occlusions

Oct 27, 2023

Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey

Aug 18, 2023

Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

Dec 01, 2022