Jingye Chen

Model as a Game: On Numerical and Spatial Consistency for Generative Games

Mar 27, 2025

AvatarArtist: Open-Domain 4D Avatarization

Mar 26, 2025

Large Motion Video Autoencoding with Cross-modal Video VAE

Dec 23, 2024

TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization

Aug 07, 2024

LLMs Meet Multimodal Generation and Editing: A Survey

May 29, 2024

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering

Nov 28, 2023

Kosmos-2.5: A Multimodal Literate Model

Sep 20, 2023

TextDiffuser: Diffusion Models as Text Painters

May 24, 2023

Chinese Character Recognition with Radical-Structured Stroke Trees

Nov 24, 2022

XDoc: Unified Pre-training for Cross-Format Document Understanding

Oct 06, 2022