Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation

Feb 15, 2023

Joshua Vendrow, Saachi Jain, Logan Engstrom, Aleksander Madry

Share this with someone who'll enjoy it:

Abstract:Distribution shifts are a major source of failure of deployed machine learning models. However, evaluating a model's reliability under distribution shifts can be challenging, especially since it may be difficult to acquire counterfactual examples that exhibit a specified shift. In this work, we introduce dataset interfaces: a framework which allows users to scalably synthesize such counterfactual examples from a given dataset. Specifically, we represent each class from the input dataset as a custom token within the text space of a text-to-image diffusion model. By incorporating these tokens into natural language prompts, we can then generate instantiations of objects in that dataset under desired distribution shifts. We demonstrate how applying our framework to the ImageNet dataset enables us to study model behavior across a diverse array of shifts, including variations in background, lighting, and attributes of the objects themselves. Code available at https://github.com/MadryLab/dataset-interfaces.

View paper on

Share this with someone who'll enjoy it:

Title:Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation

Paper and Code