Mengze Li

Semantic Alignment for Multimodal Large Language Models

Aug 23, 2024
Via arXiv

Generalization Gap in Data Augmentation: Insights from Illumination

Apr 11, 2024
Via arXiv

Enabling Collaborative Clinical Diagnosis of Infectious Keratitis by Integrating Expert Knowledge and Interpretable Data-driven Intelligence

Jan 14, 2024
Via arXiv

Revisiting the Domain Shift and Sample Uncertainty in Multi-source Active Domain Transfer

Nov 21, 2023
Via arXiv

Panoptic Scene Graph Generation with Semantics-prototype Learning

Jul 28, 2023
Via arXiv

Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models

Mar 12, 2023
Via arXiv

IDEAL: Toward High-efficiency Device-Cloud Collaborative and Dynamic Recommendation System

Feb 14, 2023
Via arXiv

Enhancing Fairness of Visual Attribute Predictors

Jul 14, 2022
Via arXiv

BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval

Jul 09, 2022
Via arXiv

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

Mar 15, 2022
Via arXiv