Shaohui Lin

National University of Singapore

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

Mar 11, 2025

LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition?

Mar 10, 2025

Autoregressive Image Generation Guided by Chains of Thought

Feb 26, 2025

Probability-density-aware Semi-supervised Learning

Dec 23, 2024

Dynamic Contrastive Knowledge Distillation for Efficient Image Restoration

Dec 12, 2024

Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification

Dec 03, 2024

Hi-Mamba: Hierarchical Mamba for Efficient Image Super-Resolution

Oct 14, 2024

HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection

Jun 27, 2024

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

May 31, 2024

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

Apr 16, 2024