Shota Horiguchi

Guided Speaker Embedding

Oct 16, 2024

Investigation of Speaker Representation for Target-Speaker Speech Processing

Oct 15, 2024

Mamba-based Segmentation Model for Speaker Diarization

Oct 10, 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR

Sep 30, 2024

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge

Sep 09, 2024

Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings

Aug 30, 2024

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling

Jul 01, 2024

Factor-Conditioned Speaking-Style Captioning

Jun 27, 2024

Thresholding Data Shapley for Data Cleansing Using Multi-Armed Bandits

Feb 13, 2024

Streaming Active Learning for Regression Problems Using Regression via Classification

Sep 02, 2023