Goeric Huybrechts

Adaptive Video Understanding Agent: Enhancing efficiency with dynamic frame sampling and feedback-driven reasoning

Oct 26, 2024
Via arXiv

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

May 14, 2024
Via arXiv

DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer

Jun 13, 2023
Via arXiv

Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR

Apr 25, 2023
Via arXiv

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation

Jul 29, 2022
Via arXiv

Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module

Feb 16, 2022
Via arXiv

Cross-speaker style transfer for text-to-speech using data augmentation

Feb 10, 2022
Via arXiv

Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech

Jun 25, 2021
Via arXiv

EmoCat: Language-agnostic Emotional Voice Conversion

Jan 14, 2021
Via arXiv

Voice Conversion for Whispered Speech Synthesis

Jan 17, 2020
Via arXiv