Man-Wai Mak

EmoQ: Speech Emotion Recognition via Speech-Aware Q-Former and Large Language Model

Sep 19, 2025
Via arXiv

Class Unbiasing for Generalization in Medical Diagnosis

Aug 09, 2025
Via arXiv

Subband Architecture Aided Selective Fixed-Filter Active Noise Control

Aug 01, 2025
Via arXiv

Bayesian Learning for Domain-Invariant Speaker Verification and Anti-Spoofing

Jun 09, 2025
Via arXiv

Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives

Jan 07, 2025
Via arXiv

On the effectiveness of enrollment speech augmentation for Target Speaker Extraction

Sep 15, 2024
Via arXiv

VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis

Mar 01, 2024
Via arXiv

Phonetic-aware speaker embedding for far-field speaker verification

Nov 27, 2023
Via arXiv

Contrastive Speaker Embedding With Sequential Disentanglement

Sep 23, 2023
Via arXiv

Asymmetric Clean Segments-Guided Self-Supervised Learning for Robust Speaker Verification

Sep 08, 2023
Via arXiv