Masato Mimura

Boosting Hybrid Autoregressive Transducer-based ASR with Internal Acoustic Model Training and Dual Blank Thresholding

Sep 30, 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR

Sep 30, 2024

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge

Sep 09, 2024

Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation

Aug 01, 2024

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling

Jul 01, 2024

Time-domain Speech Enhancement Assisted by Multi-resolution Frequency Encoder and Decoder

Mar 26, 2023

Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM

Sep 08, 2022

Distilling the Knowledge of BERT for CTC-based ASR

Sep 05, 2022

ASR Rescoring and Confidence Estimation with ELECTRA

Oct 05, 2021

Distilling the Knowledge of BERT for Sequence-to-Sequence ASR

Aug 09, 2020