Adriano Barbosa-Silva

Mapping global dynamics of benchmark creation and saturation in artificial intelligence

Mar 09, 2022

Adriano Barbosa-Silva, Simon Ott, Kathrin Blagec, Jan Brauner, Matthias Samwald

Abstract: Benchmarks are crucial to measuring and steering progress in artificial intelligence (AI). However, recent studies raised concerns over the state of AI benchmarking, reporting issues such as benchmark overfitting, benchmark saturation and increasing centralization of benchmark dataset creation. To facilitate monitoring of the health of the AI benchmarking ecosystem, we introduce methodologies for creating condensed maps of the global dynamics of benchmark creation and saturation. We curated data for 1688 benchmarks covering the entire domains of computer vision and natural language processing, and show that a large fraction of benchmarks quickly trended towards near-saturation, that many benchmarks fail to find widespread utilization, and that benchmark performance gains for different AI tasks were prone to unforeseen bursts. We conclude that future work should focus on large-scale community collaboration and on mapping benchmark performance gains to real-world utility and impact of AI.

A curated, ontology-based, large-scale knowledge graph of artificial intelligence tasks and benchmarks

Oct 06, 2021

Kathrin Blagec, Adriano Barbosa-Silva, Simon Ott, Matthias Samwald

Abstract: Research in artificial intelligence (AI) is addressing a growing number of tasks through a rapidly growing number of models and methodologies. This makes it difficult to keep track of where novel AI methods are successfully -- or still unsuccessfully -- applied, how progress is measured, how different advances might synergize with each other, and how future research should be prioritized. To help address these issues, we created the Intelligence Task Ontology and Knowledge Graph (ITO), a comprehensive, richly structured and manually curated resource on artificial intelligence tasks, benchmark results and performance metrics. The current version of ITO contains 685,560 edges, 1,100 classes representing AI processes and 1,995 properties representing performance metrics. The goal of ITO is to enable precise and network-based analyses of the global landscape of AI tasks and capabilities. ITO is based on technologies that allow for easy integration and enrichment with external data, automated inference and continuous, collaborative expert curation of underlying ontological models. We make the ITO dataset and a collection of Jupyter notebooks utilising ITO openly available.
