Takaaki Hori

Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval

Nov 04, 2024

End-to-End Speech Recognition: A Survey

Mar 03, 2023

Variable Attention Masking for Configurable Transformer Transducer Speech Recognition

Nov 02, 2022

Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR

Mar 01, 2022

Sequence Transduction with Graph-based Supervision

Nov 01, 2021

Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning

Oct 13, 2021

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy

Oct 11, 2021

Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers

Aug 04, 2021

Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition

Jul 02, 2021

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition

Jun 16, 2021