Ziping Ma

M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining

Feb 04, 2024

SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment

Jan 04, 2024