Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:An empirical study of Conv-TasNet

Feb 24, 2020

Berkan Kadioglu, Michael Horgan, Xiaoyu Liu, Jordi Pons, Dan Darcy, Vivek Kumar

Figure 1 for An empirical study of Conv-TasNet

Figure 2 for An empirical study of Conv-TasNet

Figure 3 for An empirical study of Conv-TasNet

Figure 4 for An empirical study of Conv-TasNet

Share this with someone who'll enjoy it:

Abstract:Conv-TasNet is a recently proposed waveform-based deep neural network that achieves state-of-the-art performance in speech source separation. Its architecture consists of a learnable encoder/decoder and a separator that operates on top of this learned space. Various improvements have been proposed to Conv-TasNet. However, they mostly focus on the separator, leaving its encoder/decoder as a (shallow) linear operator. In this paper, we conduct an empirical study of Conv-TasNet and propose an enhancement to the encoder/decoder that is based on a (deep) non-linear variant of it. In addition, we experiment with the larger and more diverse LibriTTS dataset and investigate the generalization capabilities of the studied models when trained on a much larger dataset. We propose cross-dataset evaluation that includes assessing separations from the WSJ0-2mix, LibriTTS and VCTK databases. Our results show that enhancements to the encoder/decoder can improve average SI-SNR performance by more than 1 dB. Furthermore, we offer insights into the generalization capabilities of Conv-TasNet and the potential value of improvements to the encoder/decoder.

* In proceedings of ICASSP2020

View paper on

Share this with someone who'll enjoy it:

Title:An empirical study of Conv-TasNet

Paper and Code