Title: ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit

Apr 11, 2023

Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi (+6 more)

Abstract: ESPnet-ST-v2 is a revamp of the open-source ESPnet-ST toolkit necessitated by the broadening interests of the spoken language translation community. ESPnet-ST-v2 supports 1) offline speech-to-text translation (ST), 2) simultaneous speech-to-text translation (SST), and 3) offline speech-to-speech translation (S2ST). Each task is supported with a wide variety of approaches, differentiating ESPnet-ST-v2 from other open-source spoken language translation toolkits. The toolkit offers state-of-the-art architectures such as transducers, hybrid CTC/attention, multi-decoders with searchable intermediates, time-synchronous blockwise CTC/attention, Translatotron models, and direct discrete unit models. In this paper, we describe the overall design, example models for each task, and performance benchmarking behind ESPnet-ST-v2, which is publicly available at https://github.com/espnet/espnet.

* This paper has been withdrawn because major updates to it are planned.
