Qingsong Xie

Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM

May 26, 2025
Via arXiv

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

May 22, 2025
Via arXiv

Improved Visual-Spatial Reasoning via R1-Zero-Like Training

Apr 01, 2025
Via arXiv

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Apr 01, 2025
Via arXiv

H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding

Mar 31, 2025
Via arXiv

Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens

Mar 12, 2025
Via arXiv

PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control

Dec 02, 2024
Via arXiv

Fine-tuning Diffusion Models for Enhancing Face Quality in Text-to-image Generation

Jun 24, 2024
Via arXiv

MLCM: Multistep Consistency Distillation of Latent Diffusion Model

Jun 12, 2024
Via arXiv

Instruct-ReID++: Towards Universal Purpose Instruction-Guided Person Re-identification

May 28, 2024
Via arXiv