Longchao Wang

ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval

Mar 31, 2022
Via arXiv