Junfeng Luo

Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding

Mar 18, 2025

A Token-level Text Image Foundation Model for Document Understanding

Mar 04, 2025

Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review

Feb 23, 2025

InstructOCR: Instruction Boosting Scene Text Spotting

Dec 20, 2024

Focus on BEV: Self-calibrated Cycle View Transformation for Monocular Birds-Eye-View Segmentation

Oct 21, 2024

Text2Street: Controllable Text-to-image Generation for Street Views

Feb 07, 2024

3rd Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation

Jun 11, 2023

Perceive, Excavate and Purify: A Novel Object Mining Framework for Instance Segmentation

Apr 18, 2023

Motion-state Alignment for Video Semantic Segmentation

Apr 18, 2023

5th Place Solution for YouTube-VOS Challenge 2022: Video Object Segmentation

Jun 20, 2022