Picture for Hongsheng Li

Hongsheng Li

FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

Add code
Sep 11, 2025
Viaarxiv icon

GLEAM: Learning to Match and Explain in Cross-View Geo-Localization

Add code
Sep 09, 2025
Viaarxiv icon

One Model for All Tasks: Leveraging Efficient World Models in Multi-Task Planning

Add code
Sep 09, 2025
Viaarxiv icon

Alignment with Fill-In-the-Middle for Enhancing Code Generation

Add code
Aug 27, 2025
Viaarxiv icon

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

Add code
Aug 13, 2025
Viaarxiv icon

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Add code
Jul 23, 2025
Viaarxiv icon

VBCD: A Voxel-Based Framework for Personalized Dental Crown Design

Add code
Jul 23, 2025
Viaarxiv icon

Resurrect Mask AutoRegressive Modeling for Efficient and Scalable Image Generation

Add code
Jul 17, 2025
Viaarxiv icon

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Add code
Jun 05, 2025
Viaarxiv icon

MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

Add code
Jun 05, 2025
Viaarxiv icon