Title: Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement

Aug 23, 2023

Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, Noboru Harada, Kunio Kashino

Abstract: We propose Audio Difference Captioning (ADC), a new extension of the audio captioning task that describes the semantic differences between a pair of similar but slightly different audio clips. ADC addresses the problem that conventional audio captioning sometimes generates similar captions for similar audio clips and thus fails to describe how their content differs. We also propose a cross-attention-concentrated transformer encoder, which extracts differences by comparing the two audio clips, and a similarity-discrepancy disentanglement, which emphasizes the difference in the latent space. To evaluate the proposed methods, we built the AudioDiffCaps dataset, which consists of pairs of similar but slightly different audio clips with human-annotated descriptions of their differences. Experiments on the AudioDiffCaps dataset showed that the proposed methods solve the ADC task effectively, and visualizing the attention weights in the transformer encoder showed that the methods improve those weights for extracting the difference.

* Accepted to DCASE2023 Workshop
