Weisi Lin

Context-Aware Deep Learning for Multi Modal Depression Detection

Dec 26, 2024
Via arXiv

VQA$^2$: Visual Question Answering for Video Quality Assessment

Nov 06, 2024
Via arXiv

R-Bench: Are your Large Multimodal Model Robust to Real-world Corruptions?

Oct 07, 2024
Via arXiv

Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs

Sep 30, 2024
Via arXiv

Explore the Hallucination on Low-level Perception for MLLMs

Sep 15, 2024
Via arXiv

MRSE: An Efficient Multi-modality Retrieval System for Large Scale E-commerce

Aug 27, 2024
Via arXiv

UNQA: Unified No-Reference Quality Assessment for Audio, Image, Video, and Audio-Visual Content

Jul 29, 2024
Via arXiv

Q-Ground: Image Quality Grounding with Large Multi-modality Models

Jul 24, 2024
Via arXiv

LPViT: Low-Power Semi-structured Pruning for Vision Transformers

Jul 02, 2024
Via arXiv

DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection

Jul 02, 2024
Via arXiv