Aili Shen

Systematic Evaluation of Predictive Fairness

Oct 17, 2022

Xudong Han, Aili Shen, Trevor Cohn, Timothy Baldwin, Lea Frermann

Abstract: Mitigating bias in training on biased datasets is an important open problem. Several techniques have been proposed; however, the typical evaluation regime is very limited and considers only narrow data conditions. For instance, the effect of target class imbalance and stereotyping is under-studied. To address this gap, we examine the performance of various debiasing methods across multiple tasks, spanning binary classification (Twitter sentiment), multi-class classification (profession prediction), and regression (valence prediction). Through extensive experimentation, we find that data conditions have a strong influence on relative model performance, and that general conclusions cannot be drawn about method efficacy when evaluating only on standard datasets, as is current practice in fairness research.

* AACL 2022

Optimising Equal Opportunity Fairness in Model Training

May 05, 2022

Aili Shen, Xudong Han, Trevor Cohn, Timothy Baldwin, Lea Frermann

Abstract: Real-world datasets often encode stereotypes and societal biases. Such biases can be implicitly captured by trained models, leading to biased predictions and exacerbating existing societal preconceptions. Existing debiasing methods, such as adversarial training and removing protected information from representations, have been shown to reduce bias. However, a disconnect between fairness criteria and training objectives makes it difficult to reason theoretically about the effectiveness of different techniques. In this work, we propose two novel training objectives which directly optimise for the widely-used criterion of equal opportunity, and show that they are effective in reducing bias while maintaining high performance over two classification tasks.

* Accepted to NAACL 2022 main conference
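
For readers unfamiliar with the criterion named in the abstract above: equal opportunity is commonly taken to mean that the true positive rate (TPR) is the same for every protected group. The sketch below measures how far a set of predictions is from that condition. It is an illustration of the fairness criterion only, not the training objectives proposed in the paper; the function names and the toy data are hypothetical.

```python
from collections import defaultdict


def true_positive_rates(y_true, y_pred, groups, positive=1):
    """Return {group: TPR}, computed over instances whose gold label is `positive`."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for gold, pred, group in zip(y_true, y_pred, groups):
        if gold == positive:
            totals[group] += 1
            if pred == positive:
                hits[group] += 1
    # Groups with no gold-positive instance never enter `totals`, so no division by zero.
    return {g: hits[g] / totals[g] for g in totals}


def equal_opportunity_gap(y_true, y_pred, groups, positive=1):
    """Largest TPR difference between any two groups; 0.0 means equal opportunity holds."""
    rates = true_positive_rates(y_true, y_pred, groups, positive)
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical): gold labels, predictions, and a protected attribute.
y_true = [1, 1, 1, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b"]

print(true_positive_rates(y_true, y_pred, groups))    # {'a': 0.75, 'b': 0.5}
print(equal_opportunity_gap(y_true, y_pred, groups))  # 0.25
```

Using the maximum pairwise difference is one of several possible aggregations; with more than two groups, a mean or root-mean-square deviation from the overall TPR is an equally reasonable choice.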

fairlib: A Unified Framework for Assessing and Improving Classification Fairness

May 04, 2022

Xudong Han, Aili Shen, Yitong Li, Lea Frermann, Timothy Baldwin, Trevor Cohn

Abstract: This paper presents fairlib, an open-source framework for assessing and improving classification fairness. It provides a systematic framework for quickly reproducing existing baseline models, developing new methods, evaluating models with different metrics, and visualizing their results. Its modularity and extensibility enable the framework to be used for diverse types of inputs, including natural language, images, and audio. In detail, we implement 14 debiasing methods, including pre-processing, at-training-time, and post-processing approaches. The built-in metrics cover the most commonly used fairness criterion and can be further generalized and customized for fairness evaluation.

* pre-print, 9 pages

Contrastive Learning for Fair Representations

Sep 22, 2021

Aili Shen, Xudong Han, Trevor Cohn, Timothy Baldwin, Lea Frermann

Abstract: Trained classification models can unintentionally lead to biased representations and predictions, which can reinforce societal preconceptions and stereotypes. Existing debiasing methods for classification models, such as adversarial training, are often expensive to train and difficult to optimise. In this paper, we propose a method for mitigating bias in classifier training by incorporating contrastive learning, in which instances sharing the same class label are encouraged to have similar representations, while instances sharing a protected attribute are forced further apart. In this way, our method learns representations which capture the task label in focused regions while ensuring the protected attribute has a diverse spread, so that it has limited impact on prediction and the resulting models are fairer. Extensive experimental results across four tasks in NLP and computer vision show (a) that our proposed method achieves fairer representations and realises bias reductions compared with competitive baselines; (b) that it does so without sacrificing main task performance; and (c) that it sets a new state of the art in one task despite reducing the bias. Finally, our method is conceptually simple and agnostic to network architectures, and incurs minimal additional compute cost.
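
The pull-together and push-apart idea in the abstract above can be sketched with a supervised contrastive loss applied twice: once with the task labels (minimised, so same-class instances move closer) and once with the protected attribute (maximised, so same-group instances spread out). This is one plausible reading of the abstract, not the paper's exact objective; the function names, the weights `alpha` and `beta`, the temperature, and the toy vectors are all assumptions, and representations are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def sup_con_loss(reps, labels, temperature=0.5):
    """Supervised contrastive loss: small when same-label instances are close together."""
    n = len(reps)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        denom = sum(math.exp(cosine(reps[i], reps[k]) / temperature)
                    for k in range(n) if k != i)
        log_probs = [math.log(math.exp(cosine(reps[i], reps[j]) / temperature) / denom)
                     for j in positives]
        total -= sum(log_probs) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def fair_contrastive_term(reps, task_labels, protected, alpha=1.0, beta=1.0):
    """Reward clustering by task label; penalise clustering by protected attribute."""
    return alpha * sup_con_loss(reps, task_labels) - beta * sup_con_loss(reps, protected)


# Toy representations (hypothetical): two tight clusters in 2-D.
reps = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
task = [0, 0, 1, 1]        # clusters align with the task label
protected = [0, 1, 0, 1]   # protected attribute is mixed within each cluster

print(fair_contrastive_term(reps, task, protected))  # negative: the desired geometry
print(fair_contrastive_term(reps, protected, task))  # positive: clusters follow the protected attribute
```

In a real model this term would be added to the usual classification loss and computed on mini-batches of encoder outputs with automatic differentiation; the pure-Python version here only shows how the two label sets enter the loss with opposite signs.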

Evaluating Document Coherence Modelling

Mar 18, 2021

Aili Shen, Meladel Mistica, Bahar Salehi, Hang Li, Timothy Baldwin, Jianzhong Qi

Abstract: While pretrained language models ("LM") have driven impressive gains over morpho-syntactic and semantic tasks, their ability to model discourse and pragmatic phenomena is less clear. As a step towards a better understanding of their discourse modelling capabilities, we propose a sentence intrusion detection task. We examine the performance of a broad range of pretrained LMs on this detection task for English. Lacking a dataset for the task, we introduce INSteD, a novel intruder sentence detection dataset, containing 170,000+ documents constructed from English Wikipedia and CNN news articles. Our experiments show that pretrained LMs perform impressively in in-domain evaluation, but experience a substantial drop in the cross-domain setting, indicating limited generalisation capacity. Further results over a novel linguistic probe dataset show that there is substantial room for improvement, especially in the cross-domain setting.

* accepted to TACL 2021

A Joint Model for Multimodal Document Quality Assessment

Jan 14, 2019

Aili Shen, Bahar Salehi, Timothy Baldwin, Jianzhong Qi

Abstract: The quality of a document is affected by various factors, including grammaticality, readability, stylistics, and expertise depth, making the task of document quality assessment a complex one. In this paper, we explore this task in the context of assessing the quality of Wikipedia articles and academic papers. Observing that the visual rendering of a document can capture implicit quality indicators (such as images, font choices, and visual layout) that are not present in the document text, we propose a joint model that combines the text content with a visual rendering of the document for document quality assessment. Experimental results over two datasets reveal that textual and visual features are complementary, achieving state-of-the-art results.
