Atsushi Ando

Guided Speaker Embedding

Oct 16, 2024
Via arXiv

Mamba-based Segmentation Model for Speaker Diarization

Oct 10, 2024
Via arXiv

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge

Sep 09, 2024
Via arXiv

Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings

Aug 30, 2024
Via arXiv

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling

Jul 01, 2024
Via arXiv

Factor-Conditioned Speaking-Style Captioning

Jun 27, 2024
Via arXiv

Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis

Feb 11, 2024
Via arXiv

NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization

Sep 22, 2023
Via arXiv

Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff

Aug 31, 2023
Via arXiv

End-to-End Joint Target and Non-Target Speakers ASR

Jun 04, 2023
Via arXiv