Title: Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding

Jun 18, 2019

Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, Minje Kim

Abstract: Speech codecs learn compact representations of speech signals to facilitate data transmission. Many recent deep neural network (DNN) based end-to-end speech codecs achieve low bitrates and high perceptual quality at the cost of model complexity. We propose a cross-module residual learning (CMRL) pipeline as a module carrier, with each module reconstructing the residual from its preceding modules. CMRL differs from other DNN-based speech codecs in that, rather than modeling the speech compression problem in a single large neural network, it optimizes a series of less-complicated modules in a two-phase training scheme. The proposed method shows better objective performance than AMR-WB and the state-of-the-art DNN-based speech codec with a similar network architecture. As an end-to-end model, it takes raw PCM signals as input, but is also compatible with linear predictive coding (LPC), showing better subjective quality at high bitrates than AMR-WB and OPUS. The gain is achieved with only 0.9 million trainable parameters, a significantly less complex architecture than the other DNN-based codecs in the literature.
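
The residual cascade described in the abstract, where each module reconstructs what the preceding modules left behind, can be illustrated with a small sketch. This is not the authors' implementation: the "modules" below are hypothetical uniform scalar quantizers standing in for the learned neural modules, the step sizes are arbitrary, and the two-phase training scheme is not modeled. The names `make_quantizer_module` and `cascade_reconstruct` are assumptions introduced for illustration only.

```python
import numpy as np

def make_quantizer_module(step):
    """Hypothetical stand-in for a learned codec module.

    A real CMRL module would be a small neural autoencoder; here a uniform
    scalar quantizer with the given step size plays that role.
    """
    def module(signal):
        return np.round(signal / step) * step
    return module

def cascade_reconstruct(x, modules):
    """Run a residual cascade: each module codes the residual of its predecessors."""
    reconstruction = np.zeros_like(x)
    residual = x.copy()
    errors = []
    for module in modules:
        decoded = module(residual)        # this module's estimate of the residual
        reconstruction += decoded         # decoder output is the sum of all estimates
        residual = x - reconstruction     # what is still left to code
        errors.append(float(np.mean(residual ** 2)))
    return reconstruction, errors

# Toy input: random samples in place of a frame of raw PCM speech (assumption).
rng = np.random.default_rng(0)
x = rng.standard_normal(512)

# Three progressively finer modules (step sizes chosen arbitrarily).
modules = [make_quantizer_module(step) for step in (0.5, 0.1, 0.02)]
x_hat, errors = cascade_reconstruct(x, modules)
print(errors)  # mean squared residual after each module
```

In this toy setting the mean squared residual shrinks after every module, which is the property the cascade relies on: no single module has to model the whole signal, so each one can stay simple.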

* Accepted for publication in INTERSPEECH 2019
