Xiangchao Yan

Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

Oct 13, 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

Jun 17, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Jun 13, 2024

OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Jun 12, 2024

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Apr 29, 2024

ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

Feb 19, 2024

ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation

Sep 25, 2023

SPOT: Scalable 3D Pre-training via Occupancy Prediction for Autonomous Driving

Sep 25, 2023

AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset

Jun 01, 2023

Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection

Mar 10, 2023