Ron Hoory

Continuous Speech Synthesis using per-token Latent Diffusion

Oct 21, 2024

Creating an African American-Sounding TTS: Guidelines, Technical Challenges, and Surprising Evaluations

Mar 17, 2024

Speak While You Think: Streaming Speech Synthesis During Text Generation

Sep 20, 2023

Towards a Common Speech Analysis Engine

Mar 01, 2022

A new data augmentation method for intent classification enhancement and its application on spoken conversation datasets

Feb 21, 2022

Speech Emotion Recognition using Self-Supervised Features

Feb 07, 2022

Speaker Normalization for Self-supervised Speech Emotion Recognition

Feb 02, 2022

RNN Transducer Models For Spoken Language Understanding

Apr 08, 2021

Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems

Oct 08, 2020

End-to-End Spoken Language Understanding Without Full Transcripts

Sep 30, 2020