Xuejing Liu

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

Dec 03, 2024

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Sep 18, 2024

SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding

Aug 27, 2024

What Makes Good Few-shot Examples for Vision-Language Models?

May 22, 2024

PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition

Feb 15, 2024

Context Disentangling and Prototype Inheriting for Robust Visual Grounding

Dec 19, 2023

What Large Language Models Bring to Text-rich VQA?

Nov 13, 2023

Deeply Coupled Cross-Modal Prompt Learning

May 30, 2023

Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

Jul 18, 2022

Parsing-based View-aware Embedding Network for Vehicle Re-Identification

Apr 10, 2020