Title: Show from Tell: Audio-Visual Modelling in Clinical Settings

Oct 25, 2023

Jianbo Jiao, Mohammad Alsharid, Lior Drukker, Aris T. Papageorghiou, Andrew Zisserman, J. Alison Noble

Abstract: Auditory and visual signals usually occur together and correlate with each other, not only in natural environments but also in clinical settings. However, audio-visual modelling in the latter case can be more challenging, due to the different sources of the audio and video signals and the noise (both signal-level and semantic-level) in the auditory signals, which are usually speech. In this paper, we consider audio-visual modelling in a clinical setting, providing a solution to learn medical representations that benefit various clinical tasks, without human expert annotation. A simple yet effective multi-modal self-supervised learning framework is proposed for this purpose. The proposed approach is able to localise anatomical regions of interest during ultrasound imaging, with only speech audio as a reference. Experimental evaluations on a large-scale clinical multi-modal ultrasound video dataset show that the proposed self-supervised method learns good transferable anatomical representations that boost the performance of automated downstream clinical tasks, even outperforming fully-supervised solutions.
