Zhen Zhao

Towards Small Object Editing: A Benchmark Dataset and A Training-Free Approach

Nov 03, 2024

UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation

Oct 14, 2024

Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation

Aug 19, 2024

Harmonizing Visual Text Comprehension and Generation

Jul 23, 2024

Depth Anything V2

Jun 13, 2024

A Large Language Model-based multi-agent manufacturing system for intelligent shopfloor

May 27, 2024

PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis

May 24, 2024

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

May 20, 2024

SOEDiff: Efficient Distillation for Small Object Editing

May 15, 2024

Training-Free Unsupervised Prompt for Vision-Language Models

Apr 25, 2024