Mateusz Łajszczak

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Feb 15, 2024

Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody

Jun 29, 2022

Discrete acoustic space for an efficient sampling in neural text-to-speech

Oct 24, 2021

In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data

Apr 04, 2019