Picture for Tetsuji Ogawa

Tetsuji Ogawa

End-to-End Speech Recognition with Pre-trained Masked Language Model

Add code
Oct 01, 2024
Figure 1 for End-to-End Speech Recognition with Pre-trained Masked Language Model
Figure 2 for End-to-End Speech Recognition with Pre-trained Masked Language Model
Figure 3 for End-to-End Speech Recognition with Pre-trained Masked Language Model
Figure 4 for End-to-End Speech Recognition with Pre-trained Masked Language Model
Viaarxiv icon

A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction

Add code
Oct 12, 2023
Viaarxiv icon

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition

Add code
Sep 19, 2023
Viaarxiv icon

Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition

Add code
Sep 09, 2023
Viaarxiv icon

Remixing-based Unsupervised Source Separation from Scratch

Add code
Sep 01, 2023
Viaarxiv icon

Neural Diarization with Non-autoregressive Intermediate Attractors

Add code
Mar 13, 2023
Viaarxiv icon

Video Surveillance System Incorporating Expert Decision-making Process: A Case Study on Detecting Calving Signs in Cattle

Add code
Jan 10, 2023
Viaarxiv icon

Self-Remixing: Unsupervised Speech Separation via Separation and Remixing

Add code
Nov 18, 2022
Viaarxiv icon

Conversation-oriented ASR with multi-look-ahead CBS architecture

Add code
Nov 02, 2022
Viaarxiv icon

BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder

Add code
Nov 02, 2022
Figure 1 for BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Figure 2 for BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Figure 3 for BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Figure 4 for BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Viaarxiv icon