Yiwu Zhong

AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning

Dec 04, 2024
Via arXiv

Omni-IML: Towards Unified Image Manipulation Localization

Nov 22, 2024
Via arXiv

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models

Oct 15, 2024
Via arXiv

Enhancing Temporal Modeling of Video LLMs via Time Gating

Oct 08, 2024
Via arXiv

Generalized Tampered Scene Text Detection in the era of Generative AI

Jul 31, 2024
Via arXiv

Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models

Mar 27, 2024
Via arXiv

Towards Learning a Generalist Model for Embodied Navigation

Dec 06, 2023
Via arXiv

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

Nov 13, 2023
Via arXiv

Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models

Oct 04, 2023
Via arXiv

Learning Concise and Descriptive Attributes for Visual Recognition

Aug 07, 2023
Via arXiv