
Marc Delcroix

Joint Training of Speaker Embedding Extractor, Speech and Overlap Detection for Diarization

Nov 04, 2024

Guided Speaker Embedding

Oct 16, 2024

Investigation of Speaker Representation for Target-Speaker Speech Processing

Oct 15, 2024

Mamba-based Segmentation Model for Speaker Diarization

Oct 10, 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR

Sep 30, 2024

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge

Sep 09, 2024

Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings

Aug 30, 2024

Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation

Aug 01, 2024

Interaural time difference loss for binaural target sound extraction

Aug 01, 2024

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling

Jul 01, 2024