
Zhihong Chen

RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models

Nov 06, 2024

Preference Fine-Tuning for Factuality in Chest X-Ray Interpretation Models Without Human Feedback

Oct 09, 2024

Overview of the First Shared Task on Clinical Text Generation: RRG24 and "Discharge Me!"

Sep 25, 2024

Merlin: A Vision Language Foundation Model for 3D Computed Tomography

Jun 10, 2024

CheXpert Plus: Augmenting a Large Chest X-ray Dataset with Text Radiology Reports, Patient Demographics and Additional Image Formats

Jun 03, 2024

GREEN: Generative Radiology Report Evaluation and Error Notation

May 06, 2024

Large Multimodal Agents: A Survey

Feb 23, 2024

ALLaVA: Harnessing GPT4V-synthesized Data for A Lite Vision-Language Model

Feb 18, 2024

CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation

Jan 22, 2024

MLLM-Bench, Evaluating Multi-modal LLMs using GPT-4V

Nov 23, 2023