Alessandro Suglia

CROPE: Evaluating In-Context Adaptation of Vision and Language Models to Culture-Specific Concepts

Oct 20, 2024
Via arXiv

Repairs in a Block World: A New Benchmark for Handling User Corrections with Multi-Modal Language Models

Sep 21, 2024
Via arXiv

Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling

Sep 09, 2024
Via arXiv

Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks

Jul 04, 2024
Via arXiv

Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation

Jun 27, 2024
Via arXiv

LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

Jun 26, 2024
Via arXiv

AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding

Jun 19, 2024
Via arXiv

Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks

May 07, 2024
Via arXiv

Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers

Apr 21, 2024
Via arXiv

PIXAR: Auto-Regressive Language Modeling in Pixel Space

Jan 06, 2024
Via arXiv