Ahmed Hussen Abdelaziz

Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models

Sep 16, 2024
Via arXiv

Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels

Sep 16, 2024
Via arXiv

Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection

Jun 13, 2024
Via arXiv

Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness

Jun 12, 2024
Via arXiv

Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?

Feb 01, 2024
Via arXiv

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

Jan 30, 2024
Via arXiv

Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features

Oct 23, 2023
Via arXiv

Audiovisual Speech Synthesis using Tacotron2

Aug 03, 2020
Via arXiv

Modality Dropout for Improved Performance-driven Talking Faces

May 27, 2020
Via arXiv

Self-supervised Learning of Visual Speech Features with Audiovisual Speech Enhancement

May 06, 2020
Via arXiv