Wei-Chiu Ma

From an Image to a Scene: Learning to Imagine the World from a Million 360 Videos

Dec 10, 2024
Via arXiv

Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering

Dec 05, 2024
Via arXiv

Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model

Aug 01, 2024
Via arXiv

Task Me Anything

Jun 17, 2024
Via arXiv

Preserving Identity with Variational Score for General-purpose 3D Editing

Jun 13, 2024
Via arXiv

ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models

Jun 10, 2024
Via arXiv

Multilingual Diversity Improves Vision-Language Representations

May 27, 2024
Via arXiv

BLINK: Multimodal Large Language Models Can See but Not Perceive

Apr 18, 2024
Via arXiv

Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video

Apr 15, 2024
Via arXiv

Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects

Jan 10, 2024
Via arXiv