Ming Tao

SoulX-Transcriber: A Robust End-to-End Framework for Multi-Speaker Speech Transcription

Jun 01, 2026
Via arXiv

Video as Natural Augmentation: Towards Unified AI-Generated Image and Video Detection

May 21, 2026
Via arXiv

SoulX-Duplug: Plug-and-Play Streaming State Prediction Module for Realtime Full-Duplex Speech Conversation

Mar 16, 2026
Via arXiv

SoulX-LiveAct: Towards Hour-Scale Real-Time Human Animation with Neighbor Forcing and ConvKV Memory

Mar 12, 2026
Via arXiv

SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis

Feb 08, 2026
Via arXiv

SoulX-FlashTalk: Real-Time Infinite Streaming of Audio-Driven Avatars via Self-Correcting Bidirectional Distillation

Jan 06, 2026
Via arXiv

SoulX-LiveTalk: Real-Time Infinite Streaming of Audio-Driven Avatars via Self-Correcting Bidirectional Distillation

Dec 31, 2025
Via arXiv

Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression

Jun 11, 2025
Via arXiv

Replace in Translation: Boost Concept Alignment in Counterfactual Text-to-Image

May 20, 2025
Via arXiv

Do We Need to Design Specific Diffusion Models for Different Tasks? Try ONE-PIC

Dec 07, 2024
Via arXiv