Haonan Lu

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Apr 01, 2025
Via arXiv

Improved Visual-Spatial Reasoning via R1-Zero-Like Training

Apr 01, 2025
Via arXiv

H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding

Mar 31, 2025
Via arXiv

Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens

Mar 12, 2025
Via arXiv

X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation

Mar 08, 2025
Via arXiv

GenX: Mastering Code and Test Generation with Execution Feedback

Dec 18, 2024
Via arXiv

PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control

Dec 02, 2024
Via arXiv

HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator

Nov 26, 2024
Via arXiv

LLMI3D: Empowering LLM with 3D Perception from a Single 2D Image

Aug 14, 2024
Via arXiv

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Jul 02, 2024
Via arXiv