Tarun Khajuria

How structured are the representations in transformer-based vision encoders? An analysis of multi-object representations in vision-language models

Jun 18, 2024
Via arXiv