CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment

Sep 23, 2022


View paper on arXiv
