Hang Zhang

Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks

Mar 27, 2025

Evaluating Bias in Retrieval-Augmented Medical Question-Answering Systems

Mar 19, 2025

Think Twice, Click Once: Enhancing GUI Grounding via Fast and Slow Systems

Mar 09, 2025

Comparative Analysis of Large Language Models for Context-Aware Code Completion using SAFIM Framework

Feb 21, 2025

Qwen2.5-VL Technical Report

Feb 19, 2025

Deep Semantic Graph Learning via LLM based Node Enhancement

Feb 11, 2025

Zero-Shot End-to-End Relation Extraction in Chinese: A Comparative Study of Gemini, LLaMA and ChatGPT

Feb 08, 2025

Towards Safe AI Clinicians: A Comprehensive Study on Large Language Model Jailbreaking in Healthcare

Jan 27, 2025

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Jan 08, 2025

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Jan 03, 2025