Danaja Rutar

Cognitive Science-Inspired Evaluation of Core Capabilities for Object Understanding in AI

Mar 27, 2025

Danaja Rutar, Alva Markelius, Konstantinos Voudouris, José Hernández-Orallo, Lucy Cheke

Abstract: One of the core components of our world models is 'intuitive physics' - an understanding of objects, space, and causality. This capability enables us to predict events, plan action and navigate environments, all of which rely on a composite sense of objecthood. Despite its importance, there is no single, unified account of objecthood, though multiple theoretical frameworks provide insights. In the first part of this paper, we present a comprehensive overview of the main theoretical frameworks in objecthood research - Gestalt psychology, enactive cognition, and developmental psychology - and identify the core capabilities each framework attributes to object understanding, as well as what functional roles they play in shaping world models in biological agents. Given the foundational role of objecthood in world modelling, understanding objecthood is also essential in AI. In the second part of the paper, we evaluate how current AI paradigms approach and test objecthood capabilities compared to those in cognitive science. We define an AI paradigm as a combination of how objecthood is conceptualised, the methods used for studying objecthood, the data utilised, and the evaluation techniques. We find that, whilst benchmarks can detect that AI systems model isolated aspects of objecthood, they cannot detect when AI systems lack functional integration across these capabilities and therefore do not fully solve the objecthood challenge. Finally, we explore novel evaluation approaches that align with the integrated vision of objecthood outlined in this paper. These methods are promising candidates for advancing from isolated object capabilities toward general-purpose AI with genuine object understanding in real-world contexts.

Predictable Artificial Intelligence

Oct 09, 2023

Lexin Zhou, Pablo A. Moreno-Casares, Fernando Martínez-Plumed, John Burden, Ryan Burnell, Lucy Cheke, Cèsar Ferri, Alexandru Marcoci, Behzad Mehrbakhsh, Yael Moros-Daval(+5 more)

Abstract: We introduce the fundamental ideas and challenges of Predictable AI, a nascent research area that explores the ways in which we can anticipate key indicators of present and future AI ecosystems. We argue that achieving predictability is crucial for fostering trust, liability, control, alignment and safety of AI ecosystems, and thus should be prioritised over performance. While distinctive from other areas of technical and non-technical AI research, the questions, hypotheses and challenges relevant to Predictable AI were yet to be clearly described. This paper aims to elucidate them, calls for identifying paths towards AI predictability and outlines the potential impact of this emergent field.

* 11 pages excluding references, 4 figures, and 2 tables. Paper Under Review

Inferring Capabilities from Task Performance with Bayesian Triangulation

Sep 21, 2023

John Burden, Konstantinos Voudouris, Ryan Burnell, Danaja Rutar, Lucy Cheke, José Hernández-Orallo

Abstract: As machine learning models become more general, we need to characterise them in richer, more meaningful ways. We describe a method to infer the cognitive profile of a system from diverse experimental data. To do so, we introduce measurement layouts that model how task-instance features interact with system capabilities to affect performance. These features must be triangulated in complex ways to be able to infer capabilities from non-populational data -- a challenge for traditional psychometric and inferential tools. Using the Bayesian probabilistic programming library PyMC, we infer different cognitive profiles for agents in two scenarios: 68 actual contestants in the AnimalAI Olympics and 30 synthetic agents for O-PIAAGETS, an object permanence battery. We showcase the potential for capability-oriented evaluation.

* 8 Pages + 14 pages of Appendices. 15 Figures. Submitted to AAAI 2024. Preprint
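
The idea of inferring a capability from how performance varies with task-instance features can be illustrated with a very small sketch. This is not the authors' measurement layout and it does not use PyMC: it is a standard-library grid approximation for a single agent, assuming one hypothetical instance feature (`demand`), one latent `ability`, a logistic link between them, and a uniform prior. The observations are made up for illustration.

```python
import math


def success_probability(ability, demand):
    """Assumed logistic link: success is likely when ability exceeds demand."""
    return 1.0 / (1.0 + math.exp(-(ability - demand)))


def posterior_over_ability(observations, grid):
    """Grid-approximate the posterior over ability under a uniform prior.

    observations: list of (demand, succeeded) pairs, one per task instance.
    grid: candidate ability values to score.
    """
    log_likelihoods = []
    for ability in grid:
        total = 0.0
        for demand, succeeded in observations:
            p = success_probability(ability, demand)
            total += math.log(p if succeeded else 1.0 - p)
        log_likelihoods.append(total)
    # Subtract the peak before exponentiating for numerical stability.
    peak = max(log_likelihoods)
    weights = [math.exp(v - peak) for v in log_likelihoods]
    norm = sum(weights)
    return [w / norm for w in weights]


def posterior_mean(grid, posterior):
    """Expected ability under the grid posterior."""
    return sum(a * p for a, p in zip(grid, posterior))


# Candidate ability values from 0.0 to 10.0 in steps of 0.1.
grid = [i * 0.1 for i in range(101)]

# Made-up results for one agent: (instance demand, succeeded?).
observations = [
    (1.0, True), (2.0, True), (3.0, True),
    (4.0, False), (6.0, False), (8.0, False),
]

posterior = posterior_over_ability(observations, grid)
estimate = posterior_mean(grid, posterior)
print(f"posterior mean ability: {estimate:.2f}")
```

Because the estimate comes from one agent's own pattern of successes and failures across instances of differing demand, no population of test-takers is needed, which is the sense in which the data are non-populational. A probabilistic programming library becomes useful once several capabilities and instance features interact and a grid is no longer tractable.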
