Zi-Yi Dou

MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

Oct 10, 2024

Unlocking Exocentric Video-Language Data for Egocentric Video Representation Learning

Aug 07, 2024

Reflection-Reinforced Self-Training for Language Agents

Jun 03, 2024

Matryoshka Query Transformer for Large Vision-Language Models

May 29, 2024

Medical Vision-Language Pre-Training for Brain Abnormalities

Apr 27, 2024

VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models

Apr 22, 2024

ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos

Nov 02, 2023

DesCo: Learning Object Recognition with Rich Language Descriptions

Jun 24, 2023

Gender Biases in Automatic Evaluation Metrics: A Case Study on Image Captioning

May 24, 2023

Masked Path Modeling for Vision-and-Language Navigation

May 23, 2023