Zhen Zhao

Towards Small Object Editing: A Benchmark Dataset and A Training-Free Approach

Nov 03, 2024
Via arXiv

UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation

Oct 14, 2024
Via arXiv

Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation

Aug 19, 2024
Via arXiv

Harmonizing Visual Text Comprehension and Generation

Jul 23, 2024
Via arXiv

Depth Anything V2

Jun 13, 2024
Via arXiv

A Large Language Model-based multi-agent manufacturing system for intelligent shopfloor

May 27, 2024
Via arXiv

PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis

May 24, 2024
Via arXiv

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

May 20, 2024
Via arXiv

SOEDiff: Efficient Distillation for Small Object Editing

May 15, 2024
Via arXiv

Training-Free Unsupervised Prompt for Vision-Language Models

Apr 25, 2024
Via arXiv