Yueqian Wang

VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format

Nov 27, 2024

Understanding Multimodal Hallucination with Parameter-Free Representation Alignment

Sep 02, 2024

End-to-End Video Question Answering with Frame Scoring Mechanisms and Adaptive Sampling

Jul 23, 2024

VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models

Jun 24, 2024

HawkEye: Training Video-Text LLMs for Grounding Text in Videos

Mar 15, 2024

LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding

Feb 25, 2024

STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering

Jan 08, 2024

VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions

May 30, 2023