Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions

May 10, 2021
Figures 1-4 for Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions

View paper on arXiv
