Kun Yao

Add-SD: Rational Generation without Manual Reference

Jul 30, 2024
Via arXiv

OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer

Jul 15, 2024
Via arXiv

Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting

Jun 13, 2024
Via arXiv

LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection

Jun 05, 2024
Via arXiv

StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond

Jun 04, 2024
Via arXiv

Towards Unified Multi-granularity Text Detection with Interactive Attention

May 30, 2024
Via arXiv

FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition

Feb 05, 2024
Via arXiv

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception

Oct 31, 2023
Via arXiv

GridFormer: Towards Accurate Table Structure Recognition via Grid Prediction

Sep 26, 2023
Via arXiv

Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation

Aug 14, 2023
Via arXiv