Shan Jiang

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding

Jun 27, 2024

PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding

Jun 12, 2024

Multi-attention Associate Prediction Network for Visual Tracking

Mar 25, 2024

A Clustering Method with Graph Maximum Decoding Information

Mar 18, 2024

Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA

Jan 29, 2024

Assessing and Enhancing Robustness of Deep Learning Models with Corruption Emulation in Digital Pathology

Oct 31, 2023

Online Video Streaming Super-Resolution with Adaptive Look-Up Table Fusion

Mar 01, 2023

A Unified Multi-view Multi-person Tracking Framework

Feb 08, 2023

The Second-place Solution for ECCV 2022 Multiple People Tracking in Group Dance Challenge

Dec 06, 2022

Hard to Track Objects with Irregular Motions and Similar Appearances? Make It Easier by Buffering the Matching Space

Dec 06, 2022