Abstract:Machine learning algorithms are extensively used to make increasingly more consequential decisions, so that achieving optimal predictive performance can no longer be the only focus. This paper explores intersectional fairness, that is fairness when intersections of multiple sensitive attributes -- such as race, age, nationality, etc. -- are considered. Previous research has mainly been focusing on fairness with respect to a single sensitive attribute, with intersectional fairness being comparatively less studied despite its critical importance for modern machine learning applications. We introduce intersectional fairness metrics by extending prior work, and provide different methodologies to audit discrimination in a given dataset or model outputs. Secondly, we develop novel post-processing techniques to mitigate any detected bias in a classification model. Our proposed methodology does not rely on any assumptions regarding the underlying model and aims at guaranteeing fairness while preserving good predictive performance. Finally, we give guidance on a practical implementation, showing how the proposed methods perform on a real-world dataset.
Abstract:We consider the complexity of equivalence and learning for multiplicity tree automata, i.e., weighted tree automata over a field. We first show that the equivalence problem is logspace equivalent to polynomial identity testing, the complexity of which is a longstanding open problem. Secondly, we derive lower bounds on the number of queries needed to learn multiplicity tree automata in Angluin's exact learning model, over both arbitrary and fixed fields. Habrard and Oncina (2006) give an exact learning algorithm for multiplicity tree automata, in which the number of queries is proportional to the size of the target automaton and the size of a largest counterexample, represented as a tree, that is returned by the Teacher. However, the smallest tree-counterexample may be exponential in the size of the target automaton. Thus the above algorithm does not run in time polynomial in the size of the target automaton, and has query complexity exponential in the lower bound. Assuming a Teacher that returns minimal DAG representations of counterexamples, we give a new exact learning algorithm whose query complexity is quadratic in the target automaton size, almost matching the lower bound, and improving the best previously-known algorithm by an exponential factor.