Shijia Yang

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Dec 03, 2024

Law of Vision Representation in MLLMs

Aug 29, 2024

HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption

Oct 03, 2023

Multitask Vision-Language Prompt Tuning

Dec 05, 2022

Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection

Oct 05, 2022

Image2Point: 3D Point-Cloud Understanding with Pretrained 2D ConvNets

Jun 08, 2021