Thomas Hueber

GIPSA-CRISSP

Simulating Articulatory Trajectories with Phonological Feature Interpolation

Aug 08, 2024
Via arXiv

Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting

May 30, 2024
Via arXiv

Investigating the dynamics of hand and lips in French Cued Speech using attention mechanisms and CTC-based decoding

Jun 14, 2023
Via arXiv

BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model

Jul 04, 2022
Via arXiv

Self-supervised speech unit discovery from articulatory and acoustic features using VQ-VAE

Jun 17, 2022
Via arXiv

Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding

Apr 11, 2022
Via arXiv

Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation

Apr 05, 2022
Via arXiv

A Benchmark of Dynamical Variational Autoencoders applied to Speech Spectrogram Modeling

Jun 14, 2021
Via arXiv

Learning robust speech representation with an articulatory-regularized variational autoencoder

Apr 07, 2021
Via arXiv

Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input

Feb 19, 2021
Via arXiv