Jianshu Zhang

VLM$^2$-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Feb 17, 2025
Via arXiv

Fast Adaptive Anti-Jamming Channel Access via Deep Q Learning and Coarse-Grained Spectrum Prediction

Feb 07, 2025
Via arXiv

Skeleton and Font Generation Network for Zero-shot Chinese Character Generation

Jan 14, 2025
Via arXiv

Bridge-Coder: Unlocking LLMs' Potential to Overcome Language Gaps in Low-Resource Code

Oct 24, 2024
Via arXiv

Personalized Visual Instruction Tuning

Oct 09, 2024
Via arXiv

See then Tell: Enhancing Key Information Extraction with Vision Grounding

Sep 29, 2024
Via arXiv

DocMamba: Efficient Document Pre-training with State Space Model

Sep 18, 2024
Via arXiv

FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation

Aug 22, 2024
Via arXiv

SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding

Jun 13, 2024
Via arXiv

Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

Jun 11, 2024
Via arXiv