Picture for Huawen Shen

Huawen Shen

Char-SAM: Turning Segment Anything Model into Scene Text Segmentation Annotator with Character-level Visual Prompts

Add code
Dec 27, 2024
Viaarxiv icon

LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining

Add code
Dec 19, 2024
Viaarxiv icon

Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues

Add code
Dec 17, 2024
Figure 1 for Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues
Figure 2 for Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues
Figure 3 for Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues
Figure 4 for Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues
Viaarxiv icon

Falcon-UI: Understanding GUI Before Following User Instructions

Add code
Dec 12, 2024
Viaarxiv icon

Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition

Add code
Jul 09, 2024
Figure 1 for Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition
Figure 2 for Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition
Figure 3 for Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition
Figure 4 for Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition
Viaarxiv icon