Mengshi Qi

Global-Local Monte Carlo Tree Search in Vision-Language Models for Text-to-3D Indoor Scene Generation

Jun 04, 2026
Via arXiv

Active Exploring like a Pigeon: Reinforcing Spatial Reasoning via Agentic Vision-Language Models

Jun 01, 2026
Via arXiv

Question-Aware Evidence Ledgers for Video Relational Reasoning

Jun 01, 2026
Via arXiv

Explainable Action Form Assessment by Exploiting Multimodal Chain-of-Thoughts Reasoning

Dec 17, 2025
Via arXiv

SoccerNet 2025 Challenges Results

Aug 26, 2025
Via arXiv

Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization

Apr 18, 2025
Via arXiv

Robo-SGG: Exploiting Layout-Oriented Normalization and Restitution for Robust Scene Graph Generation

Apr 17, 2025
Via arXiv

DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency

Apr 16, 2025
Via arXiv

Robust Disentangled Counterfactual Learning for Physical Audiovisual Commonsense Reasoning

Feb 18, 2025
Via arXiv

VLM-Assisted Continual learning for Visual Question Answering in Self-Driving

Feb 02, 2025
Via arXiv