Zhouhui Lian

UTDesign: A Unified Framework for Stylized Text Editing and Generation in Graphic Design Images

Dec 23, 2025
Via arXiv

IndoorUAV: Benchmarking Vision-Language UAV Navigation in Continuous Indoor Environments

Dec 22, 2025
Via arXiv

MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning

Jun 12, 2025
Via arXiv

TextFlux: An OCR-Free DiT Model for High-Fidelity Multilingual Scene Text Synthesis

May 23, 2025
Via arXiv

Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting

Apr 29, 2025
Via arXiv

CalliReader: Contextualizing Chinese Calligraphy via an Embedding-Aligned Vision-Language Model

Mar 09, 2025
Via arXiv

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Feb 25, 2025
Via arXiv

RSUniVLM: A Unified Vision Language Model for Remote Sensing via Granularity-oriented Mixture of Experts

Dec 10, 2024
Via arXiv

TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting

Nov 29, 2024
Via arXiv

HFH-Font: Few-shot Chinese Font Synthesis with Higher Quality, Faster Speed, and Higher Resolution

Oct 09, 2024
Via arXiv