Kate Saenko

Web Artifact Attacks Disrupt Vision Language Models

Mar 17, 2025
Via arXiv

SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models

Feb 24, 2025
Via arXiv

OP-LoRA: The Blessing of Dimensionality

Dec 13, 2024
Via arXiv

SAT: Spatial Aptitude Training for Multimodal Language Models

Dec 10, 2024
Via arXiv

Is Large-Scale Pretraining the Secret to Good Domain Generalization?

Dec 03, 2024
Via arXiv

KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models

Jul 25, 2024
Via arXiv

Tell Me What's Next: Textual Foresight for Generic UI Representations

Jun 12, 2024
Via arXiv

SLANT: Spurious Logo ANalysis Toolkit

Jun 03, 2024
Via arXiv

An Introduction to Vision-Language Modeling

May 27, 2024
Via arXiv

Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models

Apr 21, 2024
Via arXiv