Zechao Li

Visual Position Prompt for MLLM based Visual Grounding

Mar 19, 2025

A Comprehensive Survey on Visual Concept Mining in Text-to-image Diffusion Models

Mar 17, 2025

OT-DETECTOR: Delving into Optimal Transport for Zero-shot Out-of-Distribution Detection

Mar 09, 2025

3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly

Feb 09, 2025

AFANet: Adaptive Frequency-Aware Network for Weakly-Supervised Few-Shot Semantic Segmentation

Dec 23, 2024

Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception

Dec 18, 2024

Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection

Dec 02, 2024

EventCrab: Harnessing Frame and Point Synergy for Event-based Action Recognition and Beyond

Nov 27, 2024

Fast Disentangled Slim Tensor Learning for Multi-view Clustering

Nov 12, 2024

Improving Multi-modal Large Language Model through Boosting Vision Capabilities

Oct 17, 2024