Jingye Chen

TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization

Aug 07, 2024
Via arXiv

LLMs Meet Multimodal Generation and Editing: A Survey

May 29, 2024
Via arXiv

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering

Nov 28, 2023
Via arXiv

Kosmos-2.5: A Multimodal Literate Model

Sep 20, 2023
Via arXiv

TextDiffuser: Diffusion Models as Text Painters

May 24, 2023
Via arXiv

Chinese Character Recognition with Radical-Structured Stroke Trees

Nov 24, 2022
Via arXiv

XDoc: Unified Pre-training for Cross-Format Document Understanding

Oct 06, 2022
Via arXiv

Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

Dec 30, 2021
Via arXiv

Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution

Dec 13, 2021
Via arXiv

MT-TransUNet: Mediating Multi-Task Tokens in Transformers for Skin Lesion Segmentation and Classification

Dec 03, 2021
Via arXiv