Picture for Yuping Wang

Yuping Wang

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Add code
Feb 06, 2025
Viaarxiv icon

A Privacy-Preserving Domain Adversarial Federated learning for multi-site brain functional connectivity analysis

Add code
Feb 03, 2025
Viaarxiv icon

Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model

Add code
Jan 13, 2025
Viaarxiv icon

OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving

Add code
Dec 19, 2024
Viaarxiv icon

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Add code
Sep 13, 2024
Figure 1 for Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Figure 2 for Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Figure 3 for Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Figure 4 for Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Viaarxiv icon

MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN

Add code
Sep 11, 2024
Figure 1 for MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN
Figure 2 for MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN
Figure 3 for MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN
Figure 4 for MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN
Viaarxiv icon

StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion

Add code
Aug 05, 2024
Figure 1 for StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
Figure 2 for StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
Figure 3 for StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
Viaarxiv icon

Language Model Can Listen While Speaking

Add code
Aug 05, 2024
Figure 1 for Language Model Can Listen While Speaking
Figure 2 for Language Model Can Listen While Speaking
Figure 3 for Language Model Can Listen While Speaking
Figure 4 for Language Model Can Listen While Speaking
Viaarxiv icon

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

Add code
Jul 05, 2024
Viaarxiv icon

Improving Audio Generation with Visual Enhanced Caption

Add code
Jul 05, 2024
Viaarxiv icon