Mingxin Huang

Creating a Microstructure Latent Space with Rich Material Information for Multiphase Alloy Design

Sep 04, 2024

Mini-Monkey: Multi-Scale Adaptive Cropping for Multimodal Large Language Models

Aug 09, 2024

Mini-Monkey: Alleviate the Sawtooth Effect by Multi-Scale Adaptive Cropping

Aug 04, 2024

VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization

Apr 30, 2024

Bridging the Gap Between End-to-End and Two-Step Text Spotting

Apr 06, 2024

SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting

Jan 15, 2024

Progressive Evolution from Single-Point to Polygon for Scene Text

Dec 21, 2023

Hierarchical Side-Tuning for Vision Transformers

Oct 10, 2023

ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer

Aug 20, 2023

On the Hidden Mystery of OCR in Large Multimodal Models

May 13, 2023