Zhe Chen

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

Oct 21, 2024

MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding

Oct 15, 2024

Tracing Human Stress from Physiological Signals using UWB Radar

Oct 14, 2024

t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving

Oct 13, 2024

Learning to Discover Generalized Facial Expressions

Sep 30, 2024

Towards Modality-agnostic Label-efficient Segmentation with Entropy-Regularized Distribution Alignment

Aug 29, 2024

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area

Aug 16, 2024

Seeing and Understanding: Bridging Vision with Chemical Knowledge Via ChemVLM

Aug 14, 2024

ParkingE2E: Camera-based End-to-end Parking Network, from Images to Planning

Aug 04, 2024

MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity

Jul 22, 2024