Abstract:Computing the importance of features in supervised classification tasks is critical for model interpretability. Shapley values are a widely used approach for explaining model predictions, but require direct access to the underlying model, an assumption frequently violated in real-world deployments. Further, even when model access is possible, their exact computation may be prohibitively expensive. We investigate whether meaningful Shapley value estimates can be obtained in a zero-shot setting, using only the input data distribution and no evaluations of the target model. To this end, we introduce ExplainerPFN, a tabular foundation model built on TabPFN that is pretrained on synthetic datasets generated from random structural causal models and supervised using exact or near-exact Shapley values. Once trained, ExplainerPFN predicts feature attributions for unseen tabular datasets without model access, gradients, or example explanations. Our contributions are fourfold: (1) we show that few-shot learning-based explanations can achieve high fidelity to SHAP values with as few as two reference observations; (2) we propose ExplainerPFN, the first zero-shot method for estimating Shapley values without access to the underlying model or reference explanations; (3) we provide an open-source implementation of ExplainerPFN, including the full training pipeline and synthetic data generator; and (4) through extensive experiments on real and synthetic datasets, we show that ExplainerPFN achieves performance competitive with few-shot surrogate explainers that rely on 2-10 SHAP examples.
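
A minimal sketch of the exact Shapley computation of the kind that could supply such supervision targets for small feature counts; the model f, the instance x, and the reference row background are illustrative placeholders, and this is not the ExplainerPFN training pipeline.

```python
# Exact Shapley values for one instance by enumerating coalitions, with a
# single-reference value function: features outside the coalition keep their
# background values. Placeholder model and data for illustration only.
import itertools
import math
import numpy as np

def exact_shapley(f, x, background):
    d = len(x)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            for S in itertools.combinations(others, size):
                S = list(S)
                weight = math.factorial(len(S)) * math.factorial(d - len(S) - 1) / math.factorial(d)
                with_i, without_i = background.copy(), background.copy()
                with_i[S + [i]] = x[S + [i]]
                without_i[S] = x[S]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

# Toy nonlinear "model" on three features.
f = lambda z: z[0] * z[1] + z[2]
print(exact_shapley(f, x=np.array([1.0, 2.0, 3.0]), background=np.zeros(3)))  # -> [1. 1. 3.]
```
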
Abstract:Post-hoc explanations are widely used to justify, contest, and audit automated decisions in high-stakes domains. SHAP, in particular, is often treated as a reliable account of which features drove an individual prediction. Yet SHAP explanations can vary substantially across repeated runs even when the input, task, and trained model are held fixed. We term this phenomenon explanation multiplicity: multiple internally valid but substantively different explanations for the same decision. We present a methodology to characterize multiplicity in feature-attribution explanations and to disentangle variability due to model training and selection from stochasticity intrinsic to the explanation pipeline. We further show that apparent stability depends on the metric: magnitude-based distances can remain near zero while rank-based measures reveal substantial churn in the identity and ordering of top features. To contextualize observed disagreement, we derive randomized baseline values under plausible null models. Across datasets, model classes, and confidence regimes, we find that explanation multiplicity is pervasive and persists even for high-confidence predictions, highlighting the need for metrics and baselines that match the intended use of explanations.
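
The metric dependence can be made concrete with a small sketch: two attribution vectors for the same prediction can be nearly identical in magnitude yet disagree on which features rank at the top. The numbers below are invented for illustration and are not the paper's metrics or data.

```python
# Magnitude-based distance vs. rank-based agreement for two attribution runs.
import numpy as np
from scipy.stats import kendalltau

def magnitude_distance(a, b):
    return float(np.linalg.norm(a - b))

def topk_jaccard(a, b, k=2):
    top_a = set(np.argsort(-np.abs(a))[:k])
    top_b = set(np.argsort(-np.abs(b))[:k])
    return len(top_a & top_b) / len(top_a | top_b)

run1 = np.array([0.101, 0.100, 0.099, 0.010])   # attribution run 1
run2 = np.array([0.099, 0.101, 0.100, 0.010])   # attribution run 2 (different seed)

print(magnitude_distance(run1, run2))   # ~0.0024: looks "stable" by an L2 criterion
print(topk_jaccard(run1, run2, k=2))    # ~0.33: only one shared top-2 feature
tau, _ = kendalltau(run1, run2)
print(tau)                              # rank correlation well below 1
```
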
Abstract:Local feature-based explanations are a key component of the XAI toolkit. These explanations compute feature importance values relative to an ``interpretable'' feature representation. In tabular data, feature values themselves are often considered interpretable. This paper examines the impact of data engineering choices on local feature-based explanations. We demonstrate that simple, common data engineering techniques, such as binning age into histogram-style intervals or encoding race in a specific way, can manipulate feature importance as determined by popular methods like SHAP. Notably, the sensitivity of explanations to feature representation can be exploited by adversaries to obscure issues like discrimination. While the intuition behind these results is straightforward, their systematic exploration has been lacking. Previous work has focused on adversarial attacks on feature-based explainers by biasing data or manipulating models. To the best of our knowledge, this is the first study demonstrating that explainers can be misled by standard, seemingly innocuous data engineering techniques.
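
An assumed illustration of this kind of effect, using toy data and a linear model rather than the paper's experiments: for a linear model with independent features, the SHAP value of column j at instance x is coef_j * (x_j - mean_j), so re-encoding age as decade indicators spreads its attribution thinly across many columns and can change which single column appears most important.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
age = rng.integers(18, 80, n).astype(float)
income = rng.normal(50, 15, n)
y = (0.08 * age + 0.02 * income + rng.normal(0, 1, n) > 6).astype(int)  # age-driven outcome

def linear_shap(model, X, x):
    """Closed-form SHAP values (log-odds scale) for a linear model, assuming independent features."""
    return model.coef_[0] * (x - X.mean(axis=0))

# Representation 1: raw age column.
X_raw = pd.DataFrame({"age": age, "income": income})
m_raw = LogisticRegression(max_iter=1000).fit(X_raw, y)

# Representation 2: age binned into decade indicators (histogram-style encoding).
X_bin = pd.get_dummies(pd.cut(age, bins=range(10, 90, 10)), prefix="age").astype(float)
X_bin["income"] = income
m_bin = LogisticRegression(max_iter=1000).fit(X_bin, y)

i = 0  # explain the same individual under both representations
print(pd.Series(linear_shap(m_raw, X_raw.values, X_raw.values[i]), index=X_raw.columns))
print(pd.Series(linear_shap(m_bin, X_bin.values, X_bin.values[i]), index=X_bin.columns))
```
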
Abstract:Differentially private (DP) machine learning often relies on the availability of public data for tasks like privacy-utility trade-off estimation, hyperparameter tuning, and pretraining. While public data assumptions may be reasonable in text and image domains, they are less likely to hold for tabular data due to the heterogeneity of tabular data across domains. We propose leveraging powerful priors to address this limitation; specifically, we synthesize realistic tabular data directly from schema-level specifications - such as variable names, types, and permissible ranges - without ever accessing sensitive records. To that end, this work introduces the notion of "surrogate" public data - datasets generated independently of sensitive data, which consume no privacy loss budget and are constructed solely from publicly available schema or metadata. Surrogate public data are intended to encode plausible statistical assumptions (informed by publicly available information) into a dataset with many downstream uses in private mechanisms. We automate the process of generating surrogate public data with large language models (LLMs); in particular, we propose two methods: direct record generation as CSV files, and automated structural causal model (SCM) construction for sampling records. Through extensive experiments, we demonstrate that surrogate public tabular data can effectively replace traditional public data when pretraining differentially private tabular classifiers. To a lesser extent, surrogate public data are also useful for hyperparameter tuning of DP synthetic data generators, and for estimating the privacy-utility trade-off.
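
As an assumed illustration of the SCM route (not the paper's pipeline or its LLM prompts), the sketch below hand-codes the kind of small structural causal model an LLM might emit for a toy credit schema and samples surrogate records from it without touching any sensitive data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

def sample_surrogate(n):
    # Schema: age in [18, 90], education_years in [6, 20], income >= 0 (in $1000s), default in {0, 1}.
    age = rng.integers(18, 90, n)
    education_years = np.clip(rng.normal(12 + 0.02 * (age - 18), 2), 6, 20)
    income = np.clip(rng.normal(15 + 3 * education_years + 0.3 * age, 10), 0, None)
    p_default = 1 / (1 + np.exp(0.05 * (income - 60)))        # higher income -> lower default risk
    default = rng.binomial(1, p_default)
    return pd.DataFrame({"age": age, "education_years": education_years,
                         "income": income, "default": default})

# Consumes no privacy budget; usable, e.g., to pretrain a DP tabular classifier.
surrogate = sample_surrogate(5000)
print(surrogate.head())
```
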




Abstract:Data models are necessary for the birth of data and of any data-driven system. Indeed, every algorithm, every machine learning model, every statistical model, and every database has an underlying data model without which the system would not be usable. Hence, data models are excellent sites for interrogating the (material, social, political, ...) conditions giving rise to a data system. Towards this, drawing inspiration from literary criticism, we propose to closely read data models in the same spirit as we closely read literary artifacts. Close readings of data models reconnect us with, among other things, the materiality, the genealogies, the techne, the closed nature, and the design of technical systems. While recognizing from literary theory that there is no one correct way to read, it is nonetheless critical to have systematic guidance for those unfamiliar with close readings. This is especially true for those trained in the computing and data sciences, who too often are enculturated to set aside the socio-political aspects of data work. A systematic methodology for reading data models currently does not exist. To fill this gap, we present the CREDAL methodology for close readings of data models. We detail our iterative development process and present results of a qualitative evaluation of CREDAL demonstrating its usability, usefulness, and effectiveness in the critical study of data.




Abstract:Large Language Models (LLMs) have been shown to be susceptible to jailbreak attacks, or adversarial attacks used to elicit high-risk behavior from a model. Jailbreaks have been exploited by cybercriminals and blackhat actors to cause significant harm, highlighting the critical need to safeguard widely-deployed models. Safeguarding approaches, which include fine-tuning models or having LLMs "self-reflect", may lengthen the inference time of a model, incur a computational penalty, reduce the semantic fluency of an output, and restrict ``normal'' model behavior. Importantly, these Safety-Performance Trade-offs (SPTs) remain an understudied area. In this work, we introduce a novel safeguard, called SafeNudge, that combines Controlled Text Generation with "nudging", or using text interventions to change the behavior of a model. SafeNudge triggers during text generation while a jailbreak attack is being executed, and can reduce successful jailbreak attempts by 30% by guiding the LLM towards safe responses. It adds minimal latency to inference and has a negligible impact on the semantic fluency of outputs. Further, we allow for tunable SPTs. SafeNudge is open-source and available through https://pypi.org/, and is compatible with models loaded with the Hugging Face "transformers" library.
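
A minimal sketch of the general "nudge during generation" idea, under heavy assumptions: is_unsafe is a hypothetical stub standing in for a learned trigger, "gpt2" is a placeholder model, and replacing a flagged chunk with a reminder is an illustrative text intervention rather than SafeNudge's actual mechanism.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

NUDGE = "\n(Reminder: respond safely and refuse harmful requests.)\n"

def is_unsafe(chunk: str) -> bool:
    # Hypothetical detector; a real safeguard would use a learned trigger.
    return "step-by-step instructions" in chunk.lower()

@torch.no_grad()
def generate_with_nudges(prompt: str, chunk_tokens: int = 32, max_chunks: int = 6) -> str:
    text = prompt
    for _ in range(max_chunks):
        ids = tok(text, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=chunk_tokens, do_sample=True,
                             pad_token_id=tok.eos_token_id)
        chunk = tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True)
        # Steer mid-generation: drop the flagged chunk and insert the nudge instead.
        text += NUDGE if is_unsafe(chunk) else chunk
    return text

print(generate_with_nudges("Write a short story about a locksmith."))
```
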




Abstract:Concerns about the risks and harms posed by artificial intelligence (AI) have resulted in significant research into algorithmic transparency, giving rise to a sub-field known as Explainable AI (XAI). Unfortunately, despite a decade of development in XAI, an existential challenge remains: progress in research has not been fully translated into the actual implementation of algorithmic transparency by organizations. In this work, we test an approach for addressing the challenge by creating transparency advocates, or motivated individuals within organizations who drive a ground-up cultural shift towards improved algorithmic transparency. Over several years, we created an open-source educational workshop on algorithmic transparency and advocacy. We delivered the workshop to professionals across two separate domains to improve their algorithmic transparency literacy and willingness to advocate for change. In the weeks following the workshop, participants applied what they learned, such as speaking up for algorithmic transparency at an organization-wide AI strategy meeting. We also make two broader observations: first, advocacy is not a monolith and can be broken down into different levels. Second, individuals' willingness to advocate is affected by their professional field. For example, news and media professionals may be more likely to advocate for algorithmic transparency than those working at technology start-ups.




Abstract:We present Shades-of-NULL, a benchmark for responsible missing value imputation. Our benchmark includes state-of-the-art imputation techniques, and embeds them into the machine learning development lifecycle. We model realistic missingness scenarios that go beyond Rubin's classic Missing Completely at Random (MCAR), Missing At Random (MAR) and Missing Not At Random (MNAR), to include multi-mechanism missingness (when different missingness patterns co-exist in the data) and missingness shift (when the missingness mechanism changes between training and test). Another key novelty of our work is that we evaluate imputers holistically, based on the predictive performance, fairness and stability of the models that are trained and tested on the data they produce. We use Shades-of-NULL to conduct a large-scale empirical study involving 20,952 experimental pipelines, and find that, while there is no single best-performing imputation approach for all missingness types, interesting performance patterns do emerge when comparing imputer performance in simpler vs. more complex missingness scenarios. Further, while predictive performance, fairness and stability can be seen as orthogonal, we identify trade-offs among them that arise due to the combination of missingness scenario, the choice of an imputer, and the architecture of the model trained on the data post-imputation. We make Shades-of-NULL publicly available, and hope to enable researchers to comprehensively and rigorously evaluate new missing value imputation methods on a wide range of evaluation metrics, in plausible and socially meaningful missingness scenarios.
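
The missingness scenarios can be sketched on a toy two-column table as below; the mechanisms and rates are assumed for illustration and are not the benchmark's configurations.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"income": rng.normal(50, 15, 1000),
                   "age": rng.integers(18, 80, 1000).astype(float)})

def mcar(col, p=0.2):                        # Missing Completely At Random
    return rng.random(len(col)) < p

def mar(col, driver, p_hi=0.4, p_lo=0.05):   # missingness depends on another observed column
    return rng.random(len(col)) < np.where(driver > driver.median(), p_hi, p_lo)

def mnar(col, p_hi=0.4, p_lo=0.05):          # missingness depends on the (unobserved) value itself
    return rng.random(len(col)) < np.where(col > col.median(), p_hi, p_lo)

# Multi-mechanism missingness at training time: MAR on income, MCAR on age.
train = df.copy()
train.loc[mar(train["income"], train["age"]), "income"] = np.nan
train.loc[mcar(train["age"]), "age"] = np.nan

# Missingness shift: the mechanism on income changes to MNAR at test time.
test = df.copy()
test.loc[mnar(test["income"]), "income"] = np.nan
```
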




Abstract:Algorithmic decisions in critical domains such as hiring, college admissions, and lending are often based on rankings. Because of the impact these decisions have on individuals, organizations, and population groups, there is a need to understand them: to know whether the decisions are abiding by the law, to help individuals improve their rankings, and to design better ranking procedures. In this paper, we present ShaRP (Shapley for Rankings and Preferences), a framework that explains the contributions of features to different aspects of a ranked outcome, and is based on Shapley values. Using ShaRP, we show that even when the scoring function used by an algorithmic ranker is known and linear, the weight of each feature does not correspond to its Shapley value contribution. The contributions instead depend on the feature distributions, and on the subtle local interactions between the scoring features. ShaRP builds on the Quantitative Input Influence framework, and can compute the contributions of features for multiple Quantities of Interest, including score, rank, pair-wise preference, and top-k. Because it relies on black-box access to the ranker, ShaRP can be used to explain both score-based and learned ranking models. We show results of an extensive experimental validation of ShaRP using real and synthetic datasets, showcasing its usefulness for qualitative analysis.
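
A minimal sketch of the underlying idea, under assumptions (this is not the ShaRP implementation): Shapley contributions to a rank quantity of interest can be estimated by permutation sampling using only black-box calls to the scoring function, and even for a known linear scorer the resulting contributions need not match the weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_qoi(score_fn, X, z):
    """Rank of z among the rows of X (1 = best) under the black-box scorer."""
    return 1 + int(np.sum(score_fn(X) > score_fn(z[None, :])[0]))

def shapley_rank(score_fn, X, x, n_samples=200):
    d = x.shape[0]
    phi = np.zeros(d)
    for _ in range(n_samples):
        order = rng.permutation(d)
        z = X[rng.integers(len(X))].copy()    # random reference row from the data distribution
        prev = rank_qoi(score_fn, X, z)
        for i in order:
            z[i] = x[i]                       # add feature i to the coalition
            cur = rank_qoi(score_fn, X, z)
            phi[i] += prev - cur              # positive = feature i improves (lowers) the rank
            prev = cur
    return phi / n_samples

# Known, linear scoring function: contributions still depend on the feature distributions.
X = rng.normal(size=(500, 3))
score = lambda A: A @ np.array([0.5, 0.3, 0.2])
print(shapley_rank(score, X, X[0]))
```
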




Abstract:Algorithmic recourse -- providing recommendations to those affected negatively by the outcome of an algorithmic system on how they can take action and change that outcome -- has gained attention as a means of giving persons agency in their interactions with artificial intelligence (AI) systems. Recent work has shown that even if an AI decision-making classifier is ``fair'' (according to some reasonable criteria), recourse itself may be unfair due to differences in the initial circumstances of individuals, compounding disparities for marginalized populations and requiring them to exert more effort than others. There is a need to define more methods and metrics for evaluating fairness in recourse that span a range of normative views of the world, and specifically those that take into account time. Time is a critical element in recourse because the longer it takes an individual to act, the more the setting may change due to model or data drift. This paper seeks to close this research gap by proposing two notions of fairness in recourse that are in normative alignment with substantive equality of opportunity, and that consider time. The first considers the (often repeated) effort individuals exert per successful recourse event, and the second considers time per successful recourse event. Building upon an agent-based framework for simulating recourse, this paper demonstrates how much effort is needed to overcome disparities in initial circumstances. We then propose an intervention to improve the fairness of recourse by rewarding effort, and compare it to existing strategies.
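
A toy sketch of this kind of agent-based measurement, with assumed dynamics (not the paper's simulation framework): agents from two groups repeatedly exert effort toward a decision threshold that drifts over time, and the quantities of interest are effort per successful recourse event and the success rate.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(group_start, n_agents=1000, step=0.1, drift=0.01, max_rounds=50):
    efforts = []
    for _ in range(n_agents):
        score = group_start + rng.normal(0, 0.1)   # initial circumstances
        threshold, effort = 1.0, 0
        while effort < max_rounds and score < threshold:
            score += step                          # one unit of effort per round
            threshold += drift                     # the world moves while the agent acts
            effort += 1
        if score >= threshold:
            efforts.append(effort)
    success_rate = len(efforts) / n_agents
    return np.mean(efforts), success_rate

print("advantaged group   :", simulate(group_start=0.8))
print("disadvantaged group:", simulate(group_start=0.4))
```
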