Jifeng Dai

DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model

Oct 22, 2024
Via arXiv

Diffusion Transformer Policy

Oct 21, 2024
Via arXiv

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

Oct 21, 2024
Via arXiv

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Oct 17, 2024
Via arXiv

big.LITTLE Vision Transformer for Efficient Visual Recognition

Oct 14, 2024
Via arXiv

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

Oct 10, 2024
Via arXiv

MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

Aug 05, 2024
Via arXiv

MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity

Jul 22, 2024
Via arXiv

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Jul 03, 2024
Via arXiv

Hierarchical Memory for Long Video QA

Jun 30, 2024
Via arXiv