Picture for Georgios Pantazopoulos

Georgios Pantazopoulos

An Efficient Training Pipeline for Reasoning Graphical User Interface Agents

Add code
Nov 14, 2025
Viaarxiv icon

Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users

Add code
Mar 28, 2025
Viaarxiv icon

CROPE: Evaluating In-Context Adaptation of Vision and Language Models to Culture-Specific Concepts

Add code
Oct 20, 2024
Viaarxiv icon

Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling

Add code
Sep 09, 2024
Figure 1 for Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling
Figure 2 for Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling
Figure 3 for Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling
Figure 4 for Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling
Viaarxiv icon

Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation

Add code
Jun 27, 2024
Figure 1 for Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation
Figure 2 for Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation
Figure 3 for Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation
Figure 4 for Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation
Viaarxiv icon

Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks

Add code
May 07, 2024
Figure 1 for Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks
Figure 2 for Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks
Figure 3 for Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks
Figure 4 for Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks
Viaarxiv icon

Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers

Add code
Apr 21, 2024
Viaarxiv icon

Multitask Multimodal Prompted Training for Interactive Embodied Task Completion

Add code
Nov 07, 2023
Figure 1 for Multitask Multimodal Prompted Training for Interactive Embodied Task Completion
Figure 2 for Multitask Multimodal Prompted Training for Interactive Embodied Task Completion
Figure 3 for Multitask Multimodal Prompted Training for Interactive Embodied Task Completion
Figure 4 for Multitask Multimodal Prompted Training for Interactive Embodied Task Completion
Viaarxiv icon