Simon Ging

University of Freiburg

Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy

Feb 11, 2024
Via arXiv

Open-vocabulary Attribute Detection

Nov 23, 2022
Via arXiv

COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning

Nov 01, 2020
Via arXiv