
Gorka Azkune

Multimodal LLMs Do Not Compose Skills Optimally Across Modalities

Nov 12, 2025

Multimodal Large Language Models for Low-Resource Languages: A Case Study for Basque

Nov 12, 2025

Adding simple structure at inference improves Vision-Language Compositionality

Jun 11, 2025

Vision-Language Models Struggle to Align Entities across Modalities

Mar 05, 2025

Improving the Efficiency of Visually Augmented Language Models

Sep 17, 2024

BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval

Jun 14, 2024

BertaQA: How Much Do Language Models Know About Local Culture?

Jun 11, 2024

When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively

Apr 30, 2024

Grounding Spatial Relations in Text-Only Language Models

Mar 20, 2024

Improving Explicit Spatial Relationships in Text-to-Image Generation through an Automatically Derived Dataset

Mar 01, 2024