Ming Yan

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

Nov 17, 2024

Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning

Oct 30, 2024

Double Banking on Knowledge: Customized Modulation and Prototypes for Multi-Modality Semi-supervised Medical Image Segmentation

Oct 23, 2024

SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing

Sep 16, 2024

mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding

Sep 05, 2024

MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model

Aug 26, 2024

ProFuser: Progressive Fusion of Large Language Models

Aug 09, 2024

mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

Aug 09, 2024

MIBench: Evaluating Multimodal Large Language Models over Multiple Images

Jul 21, 2024

Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models

Jul 19, 2024