Picture for Georg Kucsko

Georg Kucsko

Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation

Add code
Sep 09, 2024
Figure 1 for Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Figure 2 for Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Figure 3 for Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Figure 4 for Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Viaarxiv icon

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

Add code
Apr 06, 2021
Figure 1 for SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
Figure 2 for SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
Figure 3 for SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
Figure 4 for SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
Viaarxiv icon