Zhi Gao

School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China

MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge

Feb 27, 2025
Via arXiv

Large-Scale Riemannian Meta-Optimization via Subspace Adaptation

Jan 25, 2025
Via arXiv

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

Dec 20, 2024
Via arXiv

FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models

Jul 16, 2024
Via arXiv

A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement

Mar 28, 2024
Via arXiv

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding

Mar 18, 2024
Via arXiv

Global-Local MAV Detection under Challenging Conditions based on Appearance and Motion

Dec 18, 2023
Via arXiv

CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update

Dec 18, 2023
Via arXiv

Exploring Data Geometry for Continual Learning

Apr 08, 2023
Via arXiv

Meta-causal Learning for Single Domain Generalization

Apr 07, 2023
Via arXiv