Zhida Huang

UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

Aug 05, 2024
Via arXiv

GroundingGPT: Language Enhanced Multi-modal Grounding Model

Jan 30, 2024
Via arXiv

MagFace: A Universal Representation for Face Recognition and Quality Assessment

Apr 03, 2021
Via arXiv

Visible Feature Guidance for Crowd Pedestrian Detection

Sep 16, 2020
Via arXiv

Mask R-CNN with Pyramid Attention Network for Scene Text Detection

Nov 22, 2018
Via arXiv