Daquan Zhou

LVD-2M: A Long-take Video Dataset with Temporally Dense Captions

Oct 14, 2024

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Oct 03, 2024

StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

May 02, 2024

PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning

Apr 29, 2024

Chain of Thought Explanation for Dialogue State Tracking

Mar 09, 2024

Sora Generates Videos with Stunning Geometrical Consistency

Feb 27, 2024

Magic-Me: Identity-Specific Video Customized Diffusion

Feb 14, 2024

MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation

Jan 09, 2024

Factorization Vision Transformer: Modeling Long Range Dependency with Local Window Cost

Dec 14, 2023

MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration

Nov 16, 2023