Picture for Tong Lu

Tong Lu

Driving with InternVL: Oustanding Champion in the Track on Driving with Language of the Autonomous Grand Challenge at CVPR 2024

Add code
Dec 10, 2024
Viaarxiv icon

LLM-Align: Utilizing Large Language Models for Entity Alignment in Knowledge Graphs

Add code
Dec 06, 2024
Viaarxiv icon

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Add code
Dec 06, 2024
Viaarxiv icon

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

Add code
Oct 21, 2024
Figure 1 for Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance
Figure 2 for Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance
Figure 3 for Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance
Figure 4 for Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance
Viaarxiv icon

MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding

Add code
Oct 15, 2024
Figure 1 for MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding
Figure 2 for MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding
Figure 3 for MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding
Figure 4 for MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding
Viaarxiv icon

CorrAdaptor: Adaptive Local Context Learning for Correspondence Pruning

Add code
Aug 15, 2024
Figure 1 for CorrAdaptor: Adaptive Local Context Learning for Correspondence Pruning
Figure 2 for CorrAdaptor: Adaptive Local Context Learning for Correspondence Pruning
Figure 3 for CorrAdaptor: Adaptive Local Context Learning for Correspondence Pruning
Figure 4 for CorrAdaptor: Adaptive Local Context Learning for Correspondence Pruning
Viaarxiv icon

EAR: Edge-Aware Reconstruction of 3-D vertebrae structures from bi-planar X-ray images

Add code
Jul 30, 2024
Viaarxiv icon

MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity

Add code
Jul 22, 2024
Figure 1 for MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Figure 2 for MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Figure 3 for MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Figure 4 for MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Viaarxiv icon

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation

Add code
Jun 27, 2024
Viaarxiv icon

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Add code
Jun 13, 2024
Figure 1 for OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Figure 2 for OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Figure 3 for OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Figure 4 for OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Viaarxiv icon