Mei Gao

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025

Benchmarking Large and Small MLLMs

Jan 04, 2025

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

May 21, 2023

i-Code: An Integrative and Composable Multimodal Learning Framework

May 05, 2022

Efficient Self-supervised Vision Transformers for Representation Learning

Jun 17, 2021