Jingkang Yang

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Nov 21, 2024

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

Jul 31, 2024

LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models

Jul 17, 2024

Long Context Transfer from Language to Vision

Jun 24, 2024

4D Panoptic Scene Graph Generation

May 16, 2024

WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning

May 06, 2024

Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models

Mar 29, 2024

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

Jan 18, 2024

Panoptic Video Scene Graph Generation

Nov 28, 2023

OtterHD: A High-Resolution Multi-modality Model

Nov 07, 2023