Lisa Dunlap

Visually Prompted Benchmarks Are Surprisingly Fragile

Dec 19, 2025
Via arXiv

Interpretable Embeddings with Sparse Autoencoders: A Data Analysis Toolkit

Dec 10, 2025
Via arXiv

Discovering Divergent Representations between Text-to-Image Models

Sep 10, 2025
Via arXiv

Video Action Differencing

Mar 10, 2025
Via arXiv

VisionArena: 230K Real World User-VLM Conversations with Preference Labels

Dec 11, 2024
Via arXiv

VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models

Oct 10, 2024
Via arXiv

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Jun 17, 2024
Via arXiv

See, Say, and Segment: Teaching LMMs to Overcome False Premises

Dec 13, 2023
Via arXiv

Describing Differences in Image Sets with Natural Language

Dec 05, 2023
Via arXiv

Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation

May 25, 2023
Via arXiv