Abstract: Ethics statements have been proposed as a mechanism to increase transparency and promote reflection on the societal impacts of published research. In 2020, the machine learning (ML) conference NeurIPS broke new ground by requiring that all papers include a broader impact statement. This requirement was removed in 2021, in favour of a checklist approach. The 2020 statements therefore provide a unique opportunity to learn from the broader impact experiment: to investigate the benefits and challenges of this and similar governance mechanisms, and to gain insight into how ML researchers think about the societal impacts of their own work. Such learning is needed as NeurIPS and other venues continue to question and adapt their policies. To enable this, we have created a dataset containing the impact statements from all NeurIPS 2020 papers, along with additional information such as affiliation type, location and subject area, and a simple visualisation tool for exploration. We also provide an initial quantitative analysis of the dataset, covering representation, engagement, common themes, and willingness to discuss potential harms alongside benefits. We investigate how these vary by geography, affiliation type and subject area. Drawing on these findings, we discuss the potential benefits and negative outcomes of ethics statement requirements, their possible causes, and the associated challenges. These lead us to several lessons to be learnt from the 2020 requirement: (i) the importance of creating the right incentives, (ii) the need for clear expectations and guidance, and (iii) the importance of transparency and constructive deliberation. We encourage other researchers to use our dataset to provide additional analysis, to further our understanding of how researchers responded to this requirement, and to investigate the benefits and challenges of this and related mechanisms.
Abstract: Large pre-trained language models have shown promise for few-shot learning, completing text-based tasks given only a few task-specific examples. Will models soon solve classification tasks that have so far been reserved for human research assistants? Existing benchmarks are not designed to measure progress in applied settings, and so do not directly answer this question. The RAFT benchmark (Real-world Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment. Baseline evaluations on RAFT reveal areas where current techniques struggle: reasoning over long texts and tasks with many classes. Human baselines show that some classification tasks are difficult for non-expert humans, reflecting that real-world value sometimes depends on domain expertise. Yet even the non-expert human baseline F1 scores exceed those of GPT-3 by an average of 0.11. The RAFT datasets and leaderboard will track which model improvements translate into real-world benefits at https://raft.elicit.org.
Abstract: The Global Wheat Head Detection (GWHD) dataset was created in 2020 and assembled 193,634 labelled wheat heads from 4,700 RGB images acquired with various platforms across 7 countries/institutions. With an associated competition hosted on Kaggle, GWHD successfully attracted attention from both the computer vision and agricultural science communities. From this first experience in 2020, a few avenues for improvement were identified, especially regarding data size, head diversity and label reliability. To address these issues, the 2020 dataset has been re-examined, relabelled, and augmented with 1,722 images from 5 additional countries, adding 81,553 wheat heads. We now release a new version of the Global Wheat Head Detection (GWHD) dataset, GWHD 2021, which is bigger, more diverse, and less noisy than the 2020 version. GWHD 2021 is now publicly available at http://www.global-wheat.com/ and a new data challenge has been organized on AIcrowd to make use of this updated dataset.