
Yanmin Qian

Prototype and Instance Contrastive Learning for Unsupervised Domain Adaptation in Speaker Verification

Oct 22, 2024

WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction

Sep 24, 2024

Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models

Sep 11, 2024

Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion

Sep 10, 2024

Flow-TSVAD: Target-Speaker Voice Activity Detection via Latent Flow Matching

Sep 07, 2024

Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

Jul 21, 2024

Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement

Jun 19, 2024

AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

Jun 17, 2024

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems

Jun 13, 2024

Target Speech Diarization with Multimodal Prompts

Jun 11, 2024