Hongtao Xie

A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions

Dec 12, 2024

SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition

Nov 24, 2024

Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing

Nov 23, 2024

TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model

Oct 14, 2024

How Control Information Influences Multilingual Text Image Generation and Editing?

Jul 16, 2024

Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition

Jul 08, 2024

Pistis-RAG: A Scalable Cascading Framework Towards Trustworthy Retrieval-Augmented Generation

Jun 21, 2024

Hallucination Mitigation Prompts Long-term Video Understanding

Jun 17, 2024

DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection

May 16, 2024

Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition

May 11, 2024