Detai Xin

BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec

Sep 09, 2024
Via arXiv

RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Apr 06, 2024
Via arXiv

Building speech corpus with diverse voice characteristics for its prompt-based representation

Mar 20, 2024
Via arXiv

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Mar 05, 2024
Via arXiv

JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions

Oct 09, 2023
Via arXiv

Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control

Sep 24, 2023
Via arXiv

How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics

Jun 01, 2023
Via arXiv

Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus

May 26, 2023
Via arXiv

JNV Corpus: A Corpus of Japanese Nonverbal Vocalizations with Diverse Phrases and Emotions

May 21, 2023
Via arXiv

Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech

Feb 27, 2023
Via arXiv