Fengrun Zhang

Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM

Sep 24, 2024

Disentangling Age and Identity with a Mutual Information Minimization Approach for Cross-Age Speaker Verification

Sep 24, 2024

ASD-Diffusion: Anomalous Sound Detection with Diffusion Models

Sep 24, 2024

Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations

Sep 12, 2024

Multimodal Emotion Recognition with Vision-language Prompting and Modality Dropout

Sep 11, 2024