Zilun Zhang

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration

Nov 25, 2024
Via arXiv

Enhancing Ultra High Resolution Remote Sensing Imagery Analysis with ImageRAG

Nov 12, 2024
Via arXiv

Preserving Knowledge in Large Language Model: A Model-Agnostic Self-Decompression Approach

Jun 17, 2024
Via arXiv

RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model

Jun 20, 2023
Via arXiv

Introducing Vision Transformer for Alzheimer's Disease classification task with 3D input

Oct 03, 2022
Via arXiv

Injecting Image Details into CLIP's Feature Space

Aug 31, 2022
Via arXiv

Will Multi-modal Data Improves Few-shot Learning?

Jul 25, 2021
Via arXiv

DPGN: Distribution Propagation Graph Network for Few-shot Learning

Apr 01, 2020
Via arXiv