
Zhihao Fan

RSL-SQL: Robust Schema Linking in Text-to-SQL Generation

Oct 31, 2024
Via arXiv

MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration

Oct 06, 2024
Via arXiv

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Sep 18, 2024
Via arXiv

Qwen2 Technical Report

Jul 16, 2024
Via arXiv

From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking

Jun 21, 2024
Via arXiv

MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

May 08, 2024
Via arXiv

DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning

Apr 02, 2024
Via arXiv

AI Hospital: Interactive Evaluation and Collaboration of LLMs as Intern Doctors for Clinical Diagnosis

Feb 21, 2024
Via arXiv

Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation

Feb 18, 2024
Via arXiv

ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks

Oct 17, 2023
Via arXiv