Rui Zhao

State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Mar 13, 2025
Via arXiv

Motion Anything: Any to Motion Generation

Mar 10, 2025
Via arXiv

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Mar 05, 2025
Via arXiv

Semantic Gaussian Mixture Variational Autoencoder for Sequential Recommendation

Feb 22, 2025
Via arXiv

PedDet: Adaptive Spectral Optimization for Multimodal Pedestrian Detection

Feb 21, 2025
Via arXiv

Sce2DriveX: A Generalized MLLM Framework for Scene-to-Drive Learning

Feb 19, 2025
Via arXiv

Text2Earth: Unlocking Text-driven Remote Sensing Image Generation with a Global-Scale Dataset and a Foundation Model

Jan 01, 2025
Via arXiv

Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers

Dec 23, 2024
Via arXiv

"They've Stolen My GPL-Licensed Model!": Toward Standardized and Transparent Model Licensing

Dec 16, 2024
Via arXiv

RemDet: Rethinking Efficient Model Design for UAV Object Detection

Dec 13, 2024
Via arXiv