Anton V. Porov

Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge

Mar 27, 2022

Sangjun Park, Kihyun Choo, Joohyung Lee, Anton V. Porov, Konstantin Osipov, June Sig Sung

Abstract: Text-to-Speech (TTS) services that run on edge devices have many advantages over cloud TTS, e.g., in latency and privacy. However, neural vocoders with low complexity and a small model footprint inevitably generate annoying sounds. This study proposes Bunched LPCNet2, an improved LPCNet architecture that provides highly efficient, high-quality synthesis for cloud servers and low-complexity synthesis for low-resource edge devices. A single logistic distribution achieves computational efficiency, and insightful tricks reduce the model footprint while maintaining speech quality. A DualRate architecture, which generates a lower sampling rate from a prosody model, is also proposed to reduce maintenance costs. The experiments demonstrate that Bunched LPCNet2 generates satisfactory speech quality with a model footprint of 1.1 MB while operating faster than real time on an RPi 3B. Our audio samples are available at https://srtts.github.io/bunchedLPCNet2.
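
The abstract credits a single logistic distribution with the computational savings but does not describe how it is parameterized. As a rough illustration only, the sketch below shows why sampling from one logistic distribution is cheap: it needs a single uniform draw and a closed-form inverse CDF. The function names and the parameters `mu` and `s` are assumptions made for this example, not the paper's implementation.

```python
import math
import random


def logistic_cdf(x, mu, s):
    """CDF of a logistic distribution with location mu and scale s."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


def logistic_quantile(u, mu, s):
    """Inverse CDF: the x for which logistic_cdf(x, mu, s) == u, with 0 < u < 1."""
    return mu + s * (math.log(u) - math.log1p(-u))


def sample_logistic(mu, s, rng=random):
    """Draw one sample using one uniform number and the closed-form quantile."""
    u = rng.random()
    # Keep u strictly inside (0, 1) so both logarithms stay finite.
    u = min(max(u, 1e-7), 1.0 - 1e-7)
    return logistic_quantile(u, mu, s)


if __name__ == "__main__":
    # Hypothetical location and scale, as a network might predict per sample.
    rng = random.Random(0)
    print(sample_logistic(mu=0.1, s=0.05, rng=rng))
```

By contrast, sampling from a softmax over quantized amplitude levels requires computing and scanning a full probability vector for every output sample, so a closed-form draw like this one avoids that per-sample cost.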

* Submitted to INTERSPEECH 2022
