Title: Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval

Jul 20, 2022

Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, Noboru Harada, Kunio Kashino

Abstract: The amount of audio data available on public websites is growing rapidly, and an efficient mechanism for accessing the desired data is necessary. We propose a content-based audio retrieval method that can retrieve target audio that is similar to, but slightly different from, the query audio by introducing auxiliary textual information that describes the difference between the query and target audio. While conventional content-based audio retrieval is limited to audio that is similar to the query audio, the proposed method can adjust the retrieval range by adding an embedding of the auxiliary text query-modifier to the embedding of the query audio sample in a shared latent space. To evaluate our method, we built a dataset comprising pairs of different audio clips and the text that describes the difference between them. The experimental results show that the proposed method retrieves the paired audio more accurately than the baseline. We also confirmed through visualization that the proposed method learns a shared latent space in which the audio difference and the corresponding text are represented as similar embedding vectors.
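
The retrieval step described in the abstract (adding the text query-modifier embedding to the query audio embedding, then searching the shared latent space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity for ranking, and the hand-made three-dimensional vectors are all assumptions made for illustration. In the paper the embeddings come from learned audio and text encoders, which are not reproduced here.

```python
import math


def add_vectors(u, v):
    """Element-wise sum of two equal-length embedding vectors."""
    return [a + b for a, b in zip(u, v)]


def cosine_similarity(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_audio_emb, modifier_text_emb, database, top_k=1):
    """Rank database items against (query audio embedding + modifier embedding).

    Cosine similarity is an assumed ranking metric, not one taken from the paper.
    """
    modified_query = add_vectors(query_audio_emb, modifier_text_emb)
    ranked = sorted(
        database.items(),
        key=lambda item: cosine_similarity(modified_query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical, hand-made embeddings standing in for encoder outputs.
database = {
    "dog_bark_quiet": [1.0, 0.0, 0.1],
    "dog_bark_loud": [1.0, 1.0, 0.1],
    "rain": [0.0, 0.1, 1.0],
}
query_audio = [1.0, 0.0, 0.1]   # embedding of the query clip (a quiet bark)
louder_text = [0.0, 1.0, 0.0]   # embedding of a modifier such as "louder"
no_modifier = [0.0, 0.0, 0.0]   # zero vector: plain content-based retrieval

with_modifier = retrieve(query_audio, louder_text, database)
without_modifier = retrieve(query_audio, no_modifier, database)
print(with_modifier, without_modifier)
```

With the zero modifier the search behaves like conventional content-based retrieval and returns the clip closest to the query itself; with the modifier added, the top result shifts to the clip that differs from the query in the direction the text describes.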

* Accepted to Interspeech 2022
