Kai Zhang

Victor

Generating 3D-Consistent Videos from Unposed Internet Photos

Nov 20, 2024

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Nov 05, 2024

AAAR-1.0: Assessing AI's Potential to Assist Research

Oct 29, 2024

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

Oct 22, 2024

MambaSCI: Efficient Mamba-UNet for Quad-Bayer Patterned Video Snapshot Compressive Imaging

Oct 18, 2024

Do LLMs Overcome Shortcut Learning? An Evaluation of Shortcut Challenges in Large Language Models

Oct 17, 2024

Revealing the Barriers of Language Agents in Planning

Oct 16, 2024

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats

Oct 16, 2024

DeltaDock: A Unified Framework for Accurate, Efficient, and Physically Reliable Molecular Docking

Oct 15, 2024

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models

Oct 15, 2024