Adriana Stan

Efficient training strategies for natural sounding speech synthesis and speaker adaptation based on FastPitch

Oct 09, 2024
Via arXiv

TBDM-Net: Bidirectional Dense Networks with Gender Information for Speech Emotion Recognition

Sep 16, 2024
Via arXiv

WavLM model ensemble for audio deepfake detection

Aug 14, 2024
Via arXiv

An analysis of large speech models-based representations for speech emotion recognition

Nov 01, 2023
Via arXiv

Towards generalisable and calibrated synthetic speech detection with self-supervised representations

Sep 11, 2023
Via arXiv

An analysis on the effects of speaker embedding choice in non auto-regressive TTS

Jul 19, 2023
Via arXiv

Residual Information in Deep Speaker Embedding Architectures

Feb 06, 2023
Via arXiv

The ZevoMOS entry to VoiceMOS Challenge 2022

Jun 15, 2022
Via arXiv

FlexLip: A Controllable Text-to-Lip System

Jun 07, 2022
Via arXiv

An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis

Jun 03, 2021
Via arXiv