Spandana Gella

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

Mar 19, 2025

SafeArena: Evaluating the Safety of Autonomous Web Agents

Mar 06, 2025

PairBench: A Systematic Framework for Selecting Reliable Judge VLMs

Feb 21, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Feb 03, 2025

FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering

Dec 09, 2024

"What do others think?": Task-Oriented Conversational Modeling with Subjective Knowledge

May 20, 2023

Multimodal Contextualized Plan Prediction for Embodied Task Completion

May 10, 2023

Using In-Context Learning to Improve Dialogue Safety

Feb 02, 2023

DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines

Dec 20, 2022

Dialog Acts for Task-Driven Embodied Agents

Sep 26, 2022