Mikolaj Kegler

CATSE: A Context-Aware Framework for Causal Target Sound Extraction

Mar 21, 2024

Latent CLAP Loss for Better Foley Sound Synthesis

Mar 18, 2024

Two-Step Knowledge Distillation for Tiny Speech Enhancement

Sep 15, 2023

Self-Supervised Learning for Speech Enhancement through Synthesis

Nov 04, 2022

BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping

Jun 30, 2022

Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load

Mar 30, 2022

SERAB: A multi-lingual benchmark for speech emotion recognition

Oct 07, 2021

Speech-VGG: A deep feature extractor for speech processing

Oct 22, 2019

Deep speech inpainting of time-frequency masks

Oct 22, 2019