
Sebastian Stueker

Efficient Weight factorization for Multilingual Speech Recognition

May 07, 2021

Super-Human Performance in Online Low-latency Recognition of Conversational Speech

Oct 22, 2020

Relative Positional Encoding for Speech Recognition and Direct Translation

May 20, 2020

High Performance Sequence-to-Sequence Model for Streaming Speech Recognition

Mar 22, 2020

Low Latency ASR for Simultaneous Speech Translation

Mar 22, 2020

Improving sequence-to-sequence speech recognition training with on-the-fly data augmentation

Oct 29, 2019

Learning Shared Encoding Representation for End-to-End Speech Recognition Models

Mar 31, 2019

Using multi-task learning to improve the performance of acoustic-to-word and conventional hybrid models

Feb 02, 2019

Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop

Feb 14, 2018