Zhenyu Tang

AE-NeRF: Augmenting Event-Based Neural Radiance Fields for Non-ideal Conditions and Larger Scene

Jan 07, 2025
Via arXiv

Next Patch Prediction for Autoregressive Visual Generation

Dec 19, 2024
Via arXiv

Open-Sora Plan: Open-Source Large Video Generation Model

Nov 28, 2024
Via arXiv

BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI

Oct 14, 2024
Via arXiv

Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle

Jul 28, 2024
Via arXiv

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Jun 06, 2024
Via arXiv

VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing

Apr 11, 2024
Via arXiv

Envision3D: One Image to 3D with Anchor Views Interpolation

Mar 13, 2024
Via arXiv

LLMBind: A Unified Modality-Task Integration Framework

Mar 08, 2024
Via arXiv

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Feb 04, 2024
Via arXiv