Li Yuan

GPT as a Monte Carlo Language Tree: A Probabilistic Perspective

Jan 13, 2025
Via arXiv

AE-NeRF: Augmenting Event-Based Neural Radiance Fields for Non-ideal Conditions and Larger Scene

Jan 07, 2025
Via arXiv

Hierarchical Banzhaf Interaction for General Video-Language Representation Learning

Dec 30, 2024
Via arXiv

RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing

Dec 21, 2024
Via arXiv

Next Patch Prediction for Autoregressive Visual Generation

Dec 19, 2024
Via arXiv

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Nov 30, 2024
Via arXiv

Open-Sora Plan: Open-Source Large Video Generation Model

Nov 28, 2024
Via arXiv

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Nov 26, 2024
Via arXiv

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

Nov 26, 2024
Via arXiv

LLaVA-CoT: Let Vision Language Models Reason Step-by-Step

Nov 25, 2024
Via arXiv