Title: Crowdsourcing and Evaluating Text-Based Audio Retrieval Relevances

Jun 16, 2023

Huang Xie, Khazar Khorrami, Okko Räsänen, Tuomas Virtanen

Abstract: This paper explores grading text-based audio retrieval relevances with crowdsourcing assessments. Given a free-form text (e.g., a caption) as a query, crowdworkers are asked to grade audio clips using numeric scores (between 0 and 100) to indicate their judgements of how much the sound content of an audio clip matches the text, where 0 indicates no content match at all and 100 indicates perfect content match. We integrate the crowdsourced relevances into training and evaluating text-based audio retrieval systems, and evaluate the effect of using them together with binary relevances from audio captioning. Conventionally, these binary relevances are defined by captioning-based audio-caption pairs, where being positive indicates that the caption describes the paired audio, and being negative applies to all other pairs. Experimental results indicate that there is no clear benefit from incorporating crowdsourced relevances alongside binary relevances when the crowdsourced relevances are binarized for contrastive learning. Conversely, the results suggest that using only binary relevances defined by captioning-based audio-caption pairs is sufficient for contrastive learning.
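To make the binarization step concrete, here is a minimal sketch of how graded crowdsourced scores (0 to 100) might be turned into binary relevances and merged with captioning-based positive pairs. The abstract does not state the threshold or the merging rule, so the cutoff of 50 and the helper names `binarize_score` and `merge_relevances` are assumptions made purely for illustration, not the authors' method.

```python
# Illustrative sketch only: the threshold (50) and the merging rule are
# assumptions; the abstract does not specify how scores are binarized.

def binarize_score(score, threshold=50.0):
    """Map a graded relevance score in [0, 100] to 1 (relevant) or 0 (not relevant)."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie in [0, 100]")
    return 1 if score >= threshold else 0


def merge_relevances(caption_pairs, crowd_scores, threshold=50.0):
    """Combine captioning-based positives with binarized crowdsourced relevances.

    caption_pairs: set of (text_id, audio_id) pairs where the caption was
                   written for that audio clip (the conventional positives).
    crowd_scores:  dict mapping (text_id, audio_id) to a score in [0, 100].
    Returns the set of pairs treated as positive; every other pair is negative.
    """
    positives = set(caption_pairs)
    for pair, score in crowd_scores.items():
        if binarize_score(score, threshold) == 1:
            positives.add(pair)
    return positives


caption_pairs = {("t1", "a1")}
crowd_scores = {("t1", "a2"): 80, ("t1", "a3"): 10}
positives = merge_relevances(caption_pairs, crowd_scores)
```

In this toy example, `positives` contains the original caption pair plus `("t1", "a2")`, while `("t1", "a3")` stays negative. A contrastive objective would then pull the text and audio embeddings of positive pairs together and push all other pairs apart.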
