Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

James Ainooson

A Cognitively-Inspired Neural Architecture for Visual Abstract Reasoning Using Contrastive Perceptual and Conceptual Processing

Sep 21, 2023

Yuan Yang, Deepayan Sanyal, James Ainooson, Joel Michelson, Effat Farhana, Maithilee Kunda

Abstract:We introduce a new neural architecture for solving visual abstract reasoning tasks inspired by human cognition, specifically by observations that human abstract reasoning often interleaves perceptual and conceptual processing as part of a flexible, iterative, and dynamic cognitive process. Inspired by this principle, our architecture models visual abstract reasoning as an iterative, self-contrasting learning process that pursues consistency between perceptual and conceptual processing of visual stimuli. We explain how this new Contrastive Perceptual-Conceptual Network (CPCNet) works using matrix reasoning problems in the style of the well-known Raven's Progressive Matrices intelligence test. Experiments on the machine learning dataset RAVEN show that CPCNet achieves higher accuracy than all previously published models while also using the weakest inductive bias. We also point out a substantial and previously unremarked class imbalance in the original RAVEN dataset, and we propose a new variant of RAVEN -- AB-RAVEN -- that is more balanced in terms of abstract concepts.

Via

Access Paper or Ask Questions

A Computational Account Of Self-Supervised Visual Learning From Egocentric Object Play

May 30, 2023

Deepayan Sanyal, Joel Michelson, Yuan Yang, James Ainooson, Maithilee Kunda

Abstract:Research in child development has shown that embodied experience handling physical objects contributes to many cognitive abilities, including visual learning. One characteristic of such experience is that the learner sees the same object from several different viewpoints. In this paper, we study how learning signals that equate different viewpoints -- e.g., assigning similar representations to different views of a single object -- can support robust visual learning. We use the Toybox dataset, which contains egocentric videos of humans manipulating different objects, and conduct experiments using a computer vision framework for self-supervised contrastive learning. We find that representations learned by equating different physical viewpoints of an object benefit downstream image classification accuracy. Further experiments show that this performance improvement is robust to variations in the gaps between viewpoints, and that the benefits transfer to several different image classification tasks.

Via

Access Paper or Ask Questions

An Approach for Solving Tasks on the Abstract Reasoning Corpus

Feb 18, 2023

James Ainooson, Deepayan Sanyal, Joel P. Michelson, Yuan Yang, Maithilee Kunda

Abstract:The Abstract Reasoning Corpus (ARC) is an intelligence tests for measuring fluid intelligence in artificial intelligence systems and humans alike. In this paper we present a system for reasoning about and solving ARC tasks. Our system relies on a program synthesis approach that searches a space of potential programs for ones that can solve tasks from the ARC. Programs are in a domain specific language, and in some instances our search algorithm is guided by insights from a corpus of ground truth programs. In particular: We describe an imperative style domain specific language, called Visual Imagery Reasoning Language (VIMRL), for reasoning about tasks in the ARC. We also demonstrate an innovative approach for how large search spaces can be decomposed using special high level functions that determine their own arguments through local searches on a given task item. Finally, we share our results obtained on the publicly available ARC items as well as our system's strong performance on a private test, recently tying for 4th place on the global ARCathon 2022 challenge.

Via

Access Paper or Ask Questions

Deep Non-Monotonic Reasoning for Visual Abstract Reasoning Tasks

Feb 08, 2023

Yuan Yang, Deepayan Sanyal, Joel Michelson, James Ainooson, Maithilee Kunda

Abstract:While achieving unmatched performance on many well-defined tasks, deep learning models have also been used to solve visual abstract reasoning tasks, which are relatively less well-defined, and have been widely used to measure human intelligence. However, current deep models struggle to match human abilities to solve such tasks with minimum data but maximum generalization. One limitation is that current deep learning models work in a monotonic way, i.e., treating different parts of the input in essentially fixed orderings, whereas people repeatedly observe and reason about the different parts of the visual stimuli until the reasoning process converges to a consistent conclusion, i.e., non-monotonic reasoning. This paper proposes a non-monotonic computational approach to solve visual abstract reasoning tasks. In particular, we implemented a deep learning model using this approach and tested it on the RAVEN dataset -- a dataset inspired by the Raven's Progressive Matrices test. Results show that the proposed approach is more effective than existing monotonic deep learning models, under strict experimental settings that represent a difficult variant of the RAVEN dataset problem.

Via

Access Paper or Ask Questions

Automatic Item Generation of Figural Analogy Problems: A Review and Outlook

Jan 20, 2022

Yuan Yang, Deepayan Sanyal, Joel Michelson, James Ainooson, Maithilee Kunda

Figure 1 for Automatic Item Generation of Figural Analogy Problems: A Review and Outlook

Figure 2 for Automatic Item Generation of Figural Analogy Problems: A Review and Outlook

Figure 3 for Automatic Item Generation of Figural Analogy Problems: A Review and Outlook

Figure 4 for Automatic Item Generation of Figural Analogy Problems: A Review and Outlook

Abstract:Figural analogy problems have long been a widely used format in human intelligence tests. In the past four decades, more and more research has investigated automatic item generation for figural analogy problems, i.e., algorithmic approaches for systematically and automatically creating such problems. In cognitive science and psychometrics, this research can deepen our understandings of human analogical ability and psychometric properties of figural analogies. With the recent development of data-driven AI models for reasoning about figural analogies, the territory of automatic item generation of figural analogies has further expanded. This expansion brings new challenges as well as opportunities, which demand reflection on previous item generation research and planning future studies. This paper reviews the important works of automatic item generation of figural analogies for both human intelligence tests and data-driven AI models. From an interdisciplinary perspective, the principles and technical details of these works are analyzed and compared, and desiderata for future research are suggested.

* Presented at The Ninth Advances in Cognitive Systems (ACS) Conference 2021 (arXiv:2201.06134)

Via

Access Paper or Ask Questions

Variable-Viewpoint Representations for 3D Object Recognition

Feb 08, 2020

Tengyu Ma, Joel Michelson, James Ainooson, Deepayan Sanyal, Xiaohan Wang, Maithilee Kunda

Figure 1 for Variable-Viewpoint Representations for 3D Object Recognition

Figure 2 for Variable-Viewpoint Representations for 3D Object Recognition

Figure 3 for Variable-Viewpoint Representations for 3D Object Recognition

Figure 4 for Variable-Viewpoint Representations for 3D Object Recognition

Abstract:For the problem of 3D object recognition, researchers using deep learning methods have developed several very different input representations, including "multi-view" snapshots taken from discrete viewpoints around an object, as well as "spherical" representations consisting of a dense map of essentially ray-traced samples of the object from all directions. These representations offer trade-offs in terms of what object information is captured and to what degree of detail it is captured, but it is not clear how to measure these information trade-offs since the two types of representations are so different. We demonstrate that both types of representations in fact exist at two extremes of a common representational continuum, essentially choosing to prioritize either the number of views of an object or the pixels (i.e., field of view) allotted per view. We identify interesting intermediate representations that lie at points in between these two extremes, and we show, through systematic empirical experiments, how accuracy varies along this continuum as a function of input information as well as the particular deep learning architecture that is used.

* 8 pages, 6 figures

Via

Access Paper or Ask Questions

Quantifying Human Behavior on the Block Design Test Through Automated Multi-Level Analysis of Overhead Video

Nov 19, 2018

Seunghwan Cha, James Ainooson, Maithilee Kunda

Figure 1 for Quantifying Human Behavior on the Block Design Test Through Automated Multi-Level Analysis of Overhead Video

Figure 2 for Quantifying Human Behavior on the Block Design Test Through Automated Multi-Level Analysis of Overhead Video

Figure 3 for Quantifying Human Behavior on the Block Design Test Through Automated Multi-Level Analysis of Overhead Video

Figure 4 for Quantifying Human Behavior on the Block Design Test Through Automated Multi-Level Analysis of Overhead Video

Abstract:The block design test is a standardized, widely used neuropsychological assessment of visuospatial reasoning that involves a person recreating a series of given designs out of a set of colored blocks. In current testing procedures, an expert neuropsychologist observes a person's accuracy and completion time as well as overall impressions of the person's problem-solving procedures, errors, etc., thus obtaining a holistic though subjective and often qualitative view of the person's cognitive processes. We propose a new framework that combines room sensors and AI techniques to augment the information available to neuropsychologists from block design and similar tabletop assessments. In particular, a ceiling-mounted camera captures an overhead view of the table surface. From this video, we demonstrate how automated classification using machine learning can produce a frame-level description of the state of the block task and the person's actions over the course of each test problem. We also show how a sequence-comparison algorithm can classify one individual's problem-solving strategy relative to a database of simulated strategies, and how these quantitative results can be visualized for use by neuropsychologists.

Via

Access Paper or Ask Questions

Seeing Neural Networks Through a Box of Toys: The Toybox Dataset of Visual Object Transformations

Jul 31, 2018

Xiaohan Wang, Tengyu Ma, James Ainooson, Seunghwan Cha, Xiaotian Wang, Azhar Molla, Maithilee Kunda

Figure 1 for Seeing Neural Networks Through a Box of Toys: The Toybox Dataset of Visual Object Transformations

Figure 2 for Seeing Neural Networks Through a Box of Toys: The Toybox Dataset of Visual Object Transformations

Figure 3 for Seeing Neural Networks Through a Box of Toys: The Toybox Dataset of Visual Object Transformations

Figure 4 for Seeing Neural Networks Through a Box of Toys: The Toybox Dataset of Visual Object Transformations

Abstract:Deep convolutional neural networks (CNNs) have enjoyed tremendous success in computer vision in the past several years, particularly for visual object recognition.However, how CNNs work remains poorly understood, and the training of deep CNNs is still considered more art than science. To better characterize deep CNNs and the training process, we introduce a new video dataset called Toybox. Images in Toybox come from first-person, wearable camera recordings of common household objects and toys being manually manipulated to undergo structured transformations like rotations and translations. We also present results from initial experiments using deep CNNs that begin to examine how different distributions of training data can affect visual object recognition performance, and how visual object concepts are represented within a trained network.

Via

Access Paper or Ask Questions