HongFa Wang

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Oct 14, 2023
Via arXiv