Gang Zhang

R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models

Oct 23, 2024
Via arXiv

Improving Multi-modal Large Language Model through Boosting Vision Capabilities

Oct 17, 2024
Via arXiv

Add-SD: Rational Generation without Manual Reference

Jul 30, 2024
Via arXiv

LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction

Jul 16, 2024
Via arXiv

OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer

Jul 15, 2024
Via arXiv

LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection

Jun 05, 2024
Via arXiv

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

Apr 16, 2024
Via arXiv

SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection

Mar 09, 2024
Via arXiv

FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View

Mar 05, 2024
Via arXiv

VRP-SAM: SAM with Visual Reference Prompt

Feb 27, 2024
Via arXiv