Ye Bai

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Sep 13, 2024
Via arXiv

NEST-RQ: Next Token Prediction for Speech Self-Supervised Pre-Training

Sep 13, 2024
Via arXiv

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

Jul 05, 2024
Via arXiv

TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking

Jun 07, 2024
Via arXiv

Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation

Apr 17, 2024
Via arXiv

PolyVoice: Language Models for Speech to Speech Translation

Jun 13, 2023
Via arXiv

Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition

Sep 17, 2022
Via arXiv

ADD 2022: the First Audio Deep Synthesis Detection Challenge

Feb 26, 2022
Via arXiv

Continual Learning for Fake Audio Detection

Apr 15, 2021
Via arXiv

Half-Truth: A Partially Fake Audio Detection Dataset

Apr 08, 2021
Via arXiv