
Jinqiao Wang

Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences; Objecteye Inc.

PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability

Mar 13, 2025

LightPlanner: Unleashing the Reasoning Capabilities of Lightweight Large Language Models in Task Planning

Mar 11, 2025

Synthetic Data is an Elegant GIFT for Continual Vision-Language Models

Mar 06, 2025

FLARE: A Framework for Stellar Flare Forecasting using Stellar Physical Properties and Historical Records

Feb 25, 2025

A Benchmark for Crime Surveillance Video Analysis with Large Models

Feb 13, 2025

Systematic Outliers in Large Language Models

Feb 10, 2025

MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark

Jan 28, 2025

FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization

Jan 17, 2025

LINK: Adaptive Modality Interaction for Audio-Visual Video Parsing

Dec 30, 2024

Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence

Dec 18, 2024