Abstract:The range of application of artificial intelligence (AI) is vast, as is the potential for harm. Growing awareness of potential risks from AI systems has spurred action to address those risks, while eroding confidence in AI systems and the organizations that develop them. A 2019 study found over 80 organizations that published and adopted "AI ethics principles'', and more have joined since. But the principles often leave a gap between the "what" and the "how" of trustworthy AI development. Such gaps have enabled questionable or ethically dubious behavior, which casts doubts on the trustworthiness of specific organizations, and the field more broadly. There is thus an urgent need for concrete methods that both enable AI developers to prevent harm and allow them to demonstrate their trustworthiness through verifiable behavior. Below, we explore mechanisms (drawn from arXiv:2004.07213) for creating an ecosystem where AI developers can earn trust - if they are trustworthy. Better assessment of developer trustworthiness could inform user choice, employee actions, investment decisions, legal recourse, and emerging governance regimes.
Abstract:The artificial intelligence community (AI) has recently engaged in activism in relation to their employers, other members of the community, and their governments in order to shape the societal and ethical implications of AI. It has achieved some notable successes, but prospects for further political organising and activism are uncertain. We survey activism by the AI community over the last six years; apply two analytical frameworks drawing upon the literature on epistemic communities, and worker organising and bargaining; and explore what they imply for the future prospects of the AI community. Success thus far has hinged on a coherent shared culture, and high bargaining power due to the high demand for a limited supply of AI talent. Both are crucial to the future of AI activism and worthy of sustained attention.
Abstract:This report surveys the landscape of potential security threats from malicious uses of AI, and proposes ways to better forecast, prevent, and mitigate these threats. After analyzing the ways in which AI may influence the threat landscape in the digital, physical, and political domains, we make four high-level recommendations for AI researchers and other stakeholders. We also suggest several promising areas for further research that could expand the portfolio of defenses, or make attacks less effective or harder to execute. Finally, we discuss, but do not conclusively resolve, the long-term equilibrium of attackers and defenders.