Korin Richmond

CSTR

Revisiting Acoustic Similarity in Emotional Speech and Music via Self-Supervised Representations

Sep 26, 2024
Via arXiv

Cross-lingual Speech Emotion Recognition: Humans vs. Self-Supervised Models

Sep 25, 2024
Via arXiv

Acquiring Pronunciation Knowledge from Transcribed Speech Audio via Multi-task Learning

Sep 15, 2024
Via arXiv

AccentBox: Towards High-Fidelity Zero-Shot Accent Generation

Sep 13, 2024
Via arXiv

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

Jun 13, 2024
Via arXiv

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

Dec 22, 2023
Via arXiv

Predicting pairwise preferences between TTS audio stimuli using parallel ratings data and anti-symmetric twin neural networks

Sep 22, 2022
Via arXiv

Automatic audiovisual synchronisation for ultrasound tongue imaging

May 31, 2021
Via arXiv

Silent versus modal multi-speaker speech recognition from ultrasound and video

Feb 27, 2021
Via arXiv

Exploiting ultrasound tongue imaging for the automatic detection of speech articulation errors

Feb 27, 2021
Via arXiv