Sijie Zhao

StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos

Sep 11, 2024

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Sep 03, 2024

VegeDiff: Latent Diffusion Model for Geospatial Vegetation Forecasting

Jul 17, 2024

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

May 30, 2024

SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

May 07, 2024

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

Apr 22, 2024

RS-Mamba for Large Remote Sensing Image Dense Prediction

Apr 10, 2024

VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation

Dec 14, 2023

UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

Nov 27, 2023

Exchanging Dual Encoder-Decoder: A New Strategy for Change Detection with Semantic Guidance and Spatial Localization

Nov 19, 2023