Qingyi Si

ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding

Dec 29, 2024

Multimodal Hypothetical Summary for Retrieval-based Multi-image Question Answering

Dec 19, 2024

A Multi-Task Role-Playing Agent Capable of Imitating Character Linguistic Styles

Nov 04, 2024

Towards Flexible Evaluation for Generative Visual Question Answering

Aug 01, 2024

Multimodal Table Understanding

Jun 12, 2024

Think out Loud: Emotion Deducing Explanation in Dialogues

Jun 07, 2024

Are Large Language Models Table-based Fact-Checkers?

Feb 04, 2024

Towards Unified Interactive Visual Grounding in The Wild

Jan 30, 2024

An Empirical Study of Instruction-tuning Large Language Models in Chinese

Oct 20, 2023

Combo of Thinking and Observing for Outside-Knowledge VQA

May 10, 2023