Speaker Diarization


Speaker diarization is the task of determining "who spoke when" in an audio recording: the speech signal is segmented, and the segments are clustered so that each cluster corresponds to one speaker.

Mitigating Non-Target Speaker Bias in Guided Speaker Embedding

Jun 14, 2025
Via arXiv

SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors

Mar 20, 2025
Via arXiv

Language Modelling for Speaker Diarization in Telephonic Interviews

Jan 28, 2025
Via arXiv

SCDiar: a streaming diarization system based on speaker change detection and speech recognition

Jan 28, 2025
Via arXiv

Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond

Feb 06, 2025
Via arXiv

Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge

Feb 18, 2025
Via arXiv

SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models

Jan 14, 2025
Via arXiv

DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition

Dec 30, 2024
Via arXiv

Unsupervised Speech Segmentation: A General Approach Using Speech Language Models

Jan 07, 2025
Via arXiv

Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection

Jan 07, 2025
Via arXiv