Zilong Huang

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Oct 13, 2024

CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis

Aug 27, 2024

Lightweight Model Pre-training via Language Guided Knowledge Distillation

Jun 17, 2024

Depth Anything V2

Jun 13, 2024

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

May 28, 2024

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Jan 19, 2024

Harnessing Diffusion Models for Visual Perception with Meta Prompts

Dec 22, 2023

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

Jul 17, 2023

Disentangled Pre-training for Image Matting

Apr 03, 2023

SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation

Feb 09, 2023