Abstract:Professional networks provide invaluable entree to opportunity through referrals and introductions. A rich literature shows they also serve to entrench and even exacerbate a status quo of privilege and disadvantage. Hiring platforms, equipped with the ability to nudge link formation, provide a tantalizing opening for beneficial structural change. We anticipate that key to this prospect will be the ability to estimate the likelihood of edge formation in an evolving graph. Outcome-indistinguishable prediction algorithms ensure that the modeled world is indistinguishable from the real world by a family of statistical tests. Omnipredictors ensure that predictions can be post-processed to yield loss minimization competitive with respect to a benchmark class of predictors for many losses simultaneously, with appropriate post-processing. We begin by observing that, by combining a slightly modified form of the online K29 star algorithm of Vovk (2007) with basic facts from the theory of reproducing kernel Hilbert spaces, one can derive simple and efficient online algorithms satisfying outcome indistinguishability and omniprediction, with guarantees that improve upon, or are complementary to, those currently known. This is of independent interest. We apply these techniques to evolving graphs, obtaining online outcome-indistinguishable omnipredictors for rich -- possibly infinite -- sets of distinguishers that capture properties of pairs of nodes, and their neighborhoods. This yields, inter alia, multicalibrated predictions of edge formation with respect to pairs of demographic groups, and the ability to simultaneously optimize loss as measured by a variety of social welfare functions.
Abstract:Accurate bot detection is necessary for the safety and integrity of online platforms. It is also crucial for research on the influence of bots in elections, the spread of misinformation, and financial market manipulation. Platforms deploy infrastructure to flag or remove automated accounts, but their tools and data are not publicly available. Thus, the public must rely on third-party bot detection. These tools employ machine learning and often achieve near perfect performance for classification on existing datasets, suggesting bot detection is accurate, reliable and fit for use in downstream applications. We provide evidence that this is not the case and show that high performance is attributable to limitations in dataset collection and labeling rather than sophistication of the tools. Specifically, we show that simple decision rules -- shallow decision trees trained on a small number of features -- achieve near-state-of-the-art performance on most available datasets and that bot detection datasets, even when combined together, do not generalize well to out-of-sample datasets. Our findings reveal that predictions are highly dependent on each dataset's collection and labeling procedures rather than fundamental differences between bots and humans. These results have important implications for both transparency in sampling and labeling procedures and potential biases in research using existing bot detection tools for pre-processing.
Abstract:A robust body of evidence demonstrates the adverse effects of implicit bias in various contexts--from hiring to health care. The Rooney Rule is an intervention developed to counter implicit bias and has been implemented in the private and public sectors. The Rooney Rule requires that a selection panel include at least one candidate from an underrepresented group in their shortlist of candidates. Recently, Kleinberg and Raghavan proposed a model of implicit bias and studied the effectiveness of the Rooney Rule when applied to a single selection decision. However, selection decisions often occur repeatedly over time. Further, it has been observed that, given consistent counterstereotypical feedback, implicit biases against underrepresented candidates can change. We consider a model of how a selection panel's implicit bias changes over time given their hiring decisions either with or without the Rooney Rule in place. Our main result is that, when the panel is constrained by the Rooney Rule, their implicit bias roughly reduces at a rate that is the inverse of the size of the shortlist--independent of the number of candidates, whereas without the Rooney Rule, the rate is inversely proportional to the number of candidates. Thus, when the number of candidates is much larger than the size of the shortlist, the Rooney Rule enables a faster reduction in implicit bias, providing an additional reason in favor of using it as a strategy to mitigate implicit bias. Towards empirically evaluating the long-term effect of the Rooney Rule in repeated selection decisions, we conduct an iterative candidate selection experiment on Amazon MTurk. We observe that, indeed, decision-makers subject to the Rooney Rule select more minority candidates in addition to those required by the rule itself than they would if no rule is in effect, and do so without considerably decreasing the utility of candidates selected.