Picture for Arent van Korlaar

Arent van Korlaar

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Add code
Feb 15, 2024
Figure 1 for BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Figure 2 for BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Figure 3 for BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Figure 4 for BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Viaarxiv icon

Controllable Emphasis with zero data for text-to-speech

Add code
Jul 13, 2023
Viaarxiv icon

CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer

Add code
Jun 27, 2022
Figure 1 for CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer
Figure 2 for CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer
Viaarxiv icon

Distribution augmentation for low-resource expressive text-to-speech

Add code
Feb 19, 2022
Figure 1 for Distribution augmentation for low-resource expressive text-to-speech
Figure 2 for Distribution augmentation for low-resource expressive text-to-speech
Figure 3 for Distribution augmentation for low-resource expressive text-to-speech
Figure 4 for Distribution augmentation for low-resource expressive text-to-speech
Viaarxiv icon