Title: Better Together: Dialogue Separation and Voice Activity Detection for Audio Personalization in TV

Mar 23, 2023

Matteo Torcoli, Emanuël A. P. Habets


Abstract: In TV services, dialogue level personalization is key to meeting user preferences and needs. When dialogue and background sounds are not separately available from the production stage, Dialogue Separation (DS) can estimate them to enable personalization. DS was shown to provide clear benefits for the end user. Still, the estimated signals are not perfect, and some leakage can be introduced. This is undesired, especially during passages without dialogue. We propose to combine DS and Voice Activity Detection (VAD), both recently proposed for TV audio. When their combination suggests dialogue inactivity, background components leaking in the dialogue estimate are reassigned to the background estimate. A clear improvement of the audio quality is shown for dialogue-free signals, without performance drops when dialogue is active. A post-processed VAD estimate with improved detection accuracy is also generated. It is concluded that DS and VAD can improve each other and are better used together.
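
The reassignment idea in the abstract can be illustrated with a minimal frame-wise sketch. This is not the authors' implementation: the abstract does not specify how DS and VAD are combined, so the decision rule below (VAD probability under a threshold and low energy in the dialogue estimate) is an assumption, and the function name `reassign_leakage` and the parameters `vad_threshold` and `energy_threshold` are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = List[float]


def reassign_leakage(
    dialogue: Sequence[Sequence[float]],
    background: Sequence[Sequence[float]],
    vad_probs: Sequence[float],
    vad_threshold: float = 0.5,      # hypothetical value, not from the paper
    energy_threshold: float = 1e-3,  # hypothetical value, not from the paper
) -> Tuple[List[Frame], List[Frame]]:
    """Move the dialogue estimate into the background estimate for frames
    judged dialogue-free. The inactivity rule is an assumed stand-in for the
    DS/VAD combination described in the abstract."""
    if not (len(dialogue) == len(background) == len(vad_probs)):
        raise ValueError("all inputs must contain the same number of frames")

    out_dialogue: List[Frame] = []
    out_background: List[Frame] = []
    for d, b, p in zip(dialogue, background, vad_probs):
        # Mean energy of the separated dialogue frame (DS-side evidence).
        energy = sum(x * x for x in d) / max(len(d), 1)
        # Assumed rule: both VAD and DS must point to inactivity.
        inactive = p < vad_threshold and energy < energy_threshold
        if inactive:
            # Reassign the leaked components: the sum dialogue + background
            # (the mixture) is unchanged, only the split between stems moves.
            out_background.append([bi + di for bi, di in zip(b, d)])
            out_dialogue.append([0.0] * len(d))
        else:
            out_dialogue.append(list(d))
            out_background.append(list(b))
    return out_dialogue, out_background
```

Because the reassigned frames are added to the background rather than discarded, the two output stems still sum to the original mixture, so a later dialogue-level adjustment only changes the balance between them.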

* Paper accepted to the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023), Rhodes, Greece
