Meng Cao

Continual LLaVA: Continual Instruction Tuning in Large Vision-Language Models

Nov 04, 2024
Via arXiv

Synth4Seg -- Learning Defect Data Synthesis for Defect Segmentation using Bi-level Optimization

Oct 24, 2024
Via arXiv

How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?

Oct 23, 2024
Via arXiv

ING-VP: MLLMs cannot Play Easy Vision-based Games Yet

Oct 09, 2024
Via arXiv

TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights

Oct 06, 2024
Via arXiv

Contrastive Localized Language-Image Pre-Training

Oct 03, 2024
Via arXiv

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Oct 03, 2024
Via arXiv

MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval

Aug 20, 2024
Via arXiv

Apple Intelligence Foundation Language Models

Jul 29, 2024
Via arXiv

SLRL: Structured Latent Representation Learning for Multi-view Clustering

Jul 11, 2024
Via arXiv