Title: An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement

Jan 18, 2024

Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li

Abstract: The Transformer architecture has enabled recent progress in speech enhancement. Since Transformers are position-agnostic, positional encoding is the de facto standard component used to enable Transformers to distinguish the order of elements in a sequence. However, it remains unclear exactly how positional encoding impacts speech enhancement based on Transformer architectures. In this paper, we perform a comprehensive empirical study evaluating five positional encoding methods, i.e., Sinusoidal and learned absolute position embedding (APE), T5-RPE, KERPLE, as well as the Transformer without positional encoding (No-Pos), across both causal and noncausal configurations. We conduct extensive speech enhancement experiments involving spectral mapping and masking methods. Our findings establish that positional encoding is of little help to models in a causal configuration, which indicates that causal attention may implicitly incorporate position information. In a noncausal configuration, the models benefit significantly from the use of positional encoding. In addition, we find that among the four position embeddings, relative position embeddings outperform APEs.
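As a rough illustration of why a causal configuration might carry position information implicitly, the sketch below applies a toy single-head self-attention with no positional encoding to a short sequence and checks whether shuffling the input frames merely shuffles the output. It is not the paper's model: the identity query/key/value projections, the three toy frames, and the names `self_attention` and `close` are all assumptions made for this example.

```python
import math

def self_attention(x, causal=False):
    """Toy single-head self-attention with identity Q/K/V projections (an assumption)."""
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        # With a causal mask, frame i may only attend to frames 0..i.
        visible = range(i + 1) if causal else range(n)
        scores = [sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
                  for j in visible]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * x[j][k] for w, j in zip(weights, visible))
                    for k in range(d)])
    return out

def close(a, b, tol=1e-9):
    """Element-wise comparison of two lists of vectors."""
    return all(abs(p - q) <= tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# Three toy "frames" and a reordering of them (hypothetical data).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
perm = [2, 0, 1]
shuffled = [frames[i] for i in perm]

# Noncausal, No-Pos: shuffling the input only shuffles the output,
# so the layer cannot tell the two orderings apart.
noncausal_equivariant = close(
    self_attention(shuffled),
    [self_attention(frames)[i] for i in perm],
)

# Causal, No-Pos: the mask ties each output to its position in the sequence,
# so the same check fails.
causal_equivariant = close(
    self_attention(shuffled, causal=True),
    [self_attention(frames, causal=True)[i] for i in perm],
)

print(noncausal_equivariant, causal_equivariant)
```

In this toy setting the noncausal layer is order-blind without positional encoding, while the causal mask alone makes the output depend on frame order. That is in line with the abstract's suggestion, though it says nothing about how large the effect is in the actual speech enhancement models.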

* ICASSP 2024
