Lianwen Jin

VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models

Oct 01, 2024 (via arXiv)

DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding

Aug 27, 2024 (via arXiv)

Mini-Monkey: Multi-Scale Adaptive Cropping for Multimodal Large Language Models

Aug 09, 2024 (via arXiv)

Mini-Monkey: Alleviate the Sawtooth Effect by Multi-Scale Adaptive Cropping

Aug 04, 2024 (via arXiv)

LEGO: Self-Supervised Representation Learning for Scene Text Images

Aug 04, 2024 (via arXiv)

Generalized Tampered Scene Text Detection in the era of Generative AI

Jul 31, 2024 (via arXiv)

TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language Models

Jul 04, 2024 (via arXiv)

DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming

Jun 27, 2024 (via arXiv)

Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction

Jun 05, 2024 (via arXiv)

Deciphering Oracle Bone Language with Diffusion Models

Jun 02, 2024 (via arXiv)