Shan Jiang

Target-aware Bidirectional Fusion Transformer for Aerial Object Tracking

Mar 13, 2025
Via arXiv

Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models

Feb 22, 2025
Via arXiv

Real-Time Neural-Enhancement for Online Cloud Gaming

Jan 12, 2025
Via arXiv

Distributed satellite information networks: Architecture, enabling technologies, and trends

Dec 17, 2024
Via arXiv

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding

Jun 27, 2024
Via arXiv

PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding

Jun 12, 2024
Via arXiv

Multi-attention Associate Prediction Network for Visual Tracking

Mar 25, 2024
Via arXiv

A Clustering Method with Graph Maximum Decoding Information

Mar 18, 2024
Via arXiv

Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA

Jan 29, 2024
Via arXiv

Assessing and Enhancing Robustness of Deep Learning Models with Corruption Emulation in Digital Pathology

Oct 31, 2023
Via arXiv