Rafael Valle

OMCAT: Omni Context Aware Transformer

Oct 15, 2024

Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data

Oct 02, 2024

Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment

Jun 25, 2024

Improving Text-To-Audio Models with Synthetic Captions

Jun 18, 2024

Audio Dialogues: Dialogues dataset for audio and music understanding

Apr 11, 2024

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

Feb 02, 2024

Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages

Jan 29, 2024

SelfVC: Voice Conversion With Iterative Refinement using Self Transformations

Oct 14, 2023

VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation

Mar 14, 2023

Multilingual Multiaccented Multispeaker TTS with RADTTS

Jan 24, 2023