Tetsunori Kobayashi

End-to-End Speech Recognition with Pre-trained Masked Language Model

Oct 01, 2024

Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems

Sep 30, 2024

A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction

Oct 12, 2023

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition

Sep 19, 2023

Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition

Sep 09, 2023

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

Nov 02, 2022

BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder

Nov 02, 2022

Conversation-oriented ASR with multi-look-ahead CBS architecture

Nov 02, 2022

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model

Oct 29, 2022

An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR

Oct 20, 2021