Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Toon Van de Maele

AXIOM: Learning to Play Games in Minutes with Expanding Object-Centric Models

May 30, 2025

Conor Heins, Toon Van de Maele, Alexander Tschantz, Hampus Linander, Dimitrije Markovic, Tommaso Salvatori, Corrado Pezzato, Ozan Catal, Ran Wei, Magnus Koudahl(+4 more)

Abstract:Current deep reinforcement learning (DRL) approaches achieve state-of-the-art performance in various domains, but struggle with data efficiency compared to human learning, which leverages core priors about objects and their interactions. Active inference offers a principled framework for integrating sensory information with prior knowledge to learn a world model and quantify the uncertainty of its own beliefs and predictions. However, active inference models are usually crafted for a single task with bespoke knowledge, so they lack the domain flexibility typical of DRL approaches. To bridge this gap, we propose a novel architecture that integrates a minimal yet expressive set of core priors about object-centric dynamics and interactions to accelerate learning in low-data regimes. The resulting approach, which we call AXIOM, combines the usual data efficiency and interpretability of Bayesian approaches with the across-task generalization usually associated with DRL. AXIOM represents scenes as compositions of objects, whose dynamics are modeled as piecewise linear trajectories that capture sparse object-object interactions. The structure of the generative model is expanded online by growing and learning mixture models from single events and periodically refined through Bayesian model reduction to induce generalization. AXIOM masters various games within only 10,000 interaction steps, with both a small number of parameters compared to DRL, and without the computational expense of gradient-based optimization.

* 10 pages main text, 4 figures, 2 tables; 25 pages supplementary material, 8 figures

Via

Access Paper or Ask Questions

Variational Bayes Gaussian Splatting

Oct 04, 2024

Toon Van de Maele, Ozan Catal, Alexander Tschantz, Christopher L. Buckley, Tim Verbelen

Figure 1 for Variational Bayes Gaussian Splatting

Figure 2 for Variational Bayes Gaussian Splatting

Figure 3 for Variational Bayes Gaussian Splatting

Figure 4 for Variational Bayes Gaussian Splatting

Abstract:Recently, 3D Gaussian Splatting has emerged as a promising approach for modeling 3D scenes using mixtures of Gaussians. The predominant optimization method for these models relies on backpropagating gradients through a differentiable rendering pipeline, which struggles with catastrophic forgetting when dealing with continuous streams of data. To address this limitation, we propose Variational Bayes Gaussian Splatting (VBGS), a novel approach that frames training a Gaussian splat as variational inference over model parameters. By leveraging the conjugacy properties of multivariate Gaussians, we derive a closed-form variational update rule, allowing efficient updates from partial, sequential observations without the need for replay buffers. Our experiments show that VBGS not only matches state-of-the-art performance on static datasets, but also enables continual learning from sequentially streamed 2D and 3D data, drastically improving performance in this setting.

Via

Access Paper or Ask Questions

Belief sharing: a blessing or a curse

Jul 02, 2024

Ozan Catal, Toon Van de Maele, Riddhi J. Pitliya, Mahault Albarracin, Candice Pattisapu, Tim Verbelen

Abstract:When collaborating with multiple parties, communicating relevant information is of utmost importance to efficiently completing the tasks at hand. Under active inference, communication can be cast as sharing beliefs between free-energy minimizing agents, where one agent's beliefs get transformed into an observation modality for the other. However, the best approach for transforming beliefs into observations remains an open question. In this paper, we demonstrate that naively sharing posterior beliefs can give rise to the negative social dynamics of echo chambers and self-doubt. We propose an alternate belief sharing strategy which mitigates these issues.

Via

Access Paper or Ask Questions

Learning Spatial and Temporal Hierarchies: Hierarchical Active Inference for navigation in Multi-Room Maze Environments

Sep 18, 2023

Daria de Tinguy, Toon Van de Maele, Tim Verbelen, Bart Dhoedt

Abstract:Cognitive maps play a crucial role in facilitating flexible behaviour by representing spatial and conceptual relationships within an environment. The ability to learn and infer the underlying structure of the environment is crucial for effective exploration and navigation. This paper introduces a hierarchical active inference model addressing the challenge of inferring structure in the world from pixel-based observations. We propose a three-layer hierarchical model consisting of a cognitive map, an allocentric, and an egocentric world model, combining curiosity-driven exploration with goal-oriented behaviour at the different levels of reasoning from context to place to motion. This allows for efficient exploration and goal-directed search in room-structured mini-grid environments.

* IROS 2023 Workshop World Models and Predictive Coding in Cognitive Robotics. arXiv admin note: text overlap with arXiv:2306.13546

Via

Access Paper or Ask Questions

Integrating cognitive map learning and active inference for planning in ambiguous environments

Aug 16, 2023

Toon Van de Maele, Bart Dhoedt, Tim Verbelen, Giovanni Pezzulo

Abstract:Living organisms need to acquire both cognitive maps for learning the structure of the world and planning mechanisms able to deal with the challenges of navigating ambiguous environments. Although significant progress has been made in each of these areas independently, the best way to integrate them is an open research question. In this paper, we propose the integration of a statistical model of cognitive map formation within an active inference agent that supports planning under uncertainty. Specifically, we examine the clone-structured cognitive graph (CSCG) model of cognitive map formation and compare a naive clone graph agent with an active inference-driven clone graph agent, in three spatial navigation scenarios. Our findings demonstrate that while both agents are effective in simple scenarios, the active inference agent is more effective when planning in challenging scenarios, in which sensory observations provide ambiguous information about location.

* Accepted at IWAI 2023 (International Workshop on Active Inference)

Via

Access Paper or Ask Questions

Inferring Hierarchical Structure in Multi-Room Maze Environments

Jun 23, 2023

Daria de Tinguy, Toon Van de Maele, Tim Verbelen, Bart Dhoedt

Figure 1 for Inferring Hierarchical Structure in Multi-Room Maze Environments

Figure 2 for Inferring Hierarchical Structure in Multi-Room Maze Environments

Figure 3 for Inferring Hierarchical Structure in Multi-Room Maze Environments

Figure 4 for Inferring Hierarchical Structure in Multi-Room Maze Environments

* ICML 2023 Workshop

Via

Access Paper or Ask Questions

Symmetry and Complexity in Object-Centric Deep Active Inference Models

Apr 14, 2023

Stefano Ferraro, Toon Van de Maele, Tim Verbelen, Bart Dhoedt

Abstract:Humans perceive and interact with hundreds of objects every day. In doing so, they need to employ mental models of these objects and often exploit symmetries in the object's shape and appearance in order to learn generalizable and transferable skills. Active inference is a first principles approach to understanding and modeling sentient agents. It states that agents entertain a generative model of their environment, and learn and act by minimizing an upper bound on their surprisal, i.e. their Free Energy. The Free Energy decomposes into an accuracy and complexity term, meaning that agents favor the least complex model, that can accurately explain their sensory observations. In this paper, we investigate how inherent symmetries of particular objects also emerge as symmetries in the latent state space of the generative model learnt under deep active inference. In particular, we focus on object-centric representations, which are trained from pixels to predict novel object views as the agent moves its viewpoint. First, we investigate the relation between model complexity and symmetry exploitation in the state space. Second, we do a principal component analysis to demonstrate how the model encodes the principal axis of symmetry of the object in the latent space. Finally, we also demonstrate how more symmetrical representations can be exploited for better generalization in the context of manipulation.

Via

Access Paper or Ask Questions

Object-Centric Scene Representations using Active Inference

Feb 07, 2023

Toon Van de Maele, Tim Verbelen, Pietro Mazzaglia, Stefano Ferraro, Bart Dhoedt

Abstract:Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this paper, we propose a novel approach for scene understanding, leveraging a hierarchical object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint given a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and outperforms both supervised and reinforcement learning baselines by a large margin.

Via

Access Paper or Ask Questions

FMCW Radar Sensing for Indoor Drones Using Learned Representations

Jan 06, 2023

Ali Safa, Tim Verbelen, Ozan Catal, Toon Van de Maele, Matthias Hartmann, Bart Dhoedt, André Bourdoux

Abstract:Frequency-modulated continuous-wave (FMCW) radar is a promising sensor technology for indoor drones as it provides range, angular as well as Doppler-velocity information about obstacles in the environment. Recently, deep learning approaches have been proposed for processing FMCW data, outperforming traditional detection techniques on range-Doppler or range-azimuth maps. However, these techniques come at a cost; for each novel task a deep neural network architecture has to be trained on high-dimensional input data, stressing both data bandwidth and processing budget. In this paper, we investigate unsupervised learning techniques that generate low-dimensional representations from FMCW radar data, and evaluate to what extent these representations can be reused for multiple downstream tasks. To this end, we introduce a novel dataset of raw radar ADC data recorded from a radar mounted on a flying drone platform in an indoor environment, together with ground truth detection targets. We show with real radar data that, utilizing our learned representations, we match the performance of conventional radar processing techniques and that our model can be trained on different input modalities such as raw ADC samples of only two consecutively transmitted chirps.

Via

Access Paper or Ask Questions

Disentangling Shape and Pose for Object-Centric Deep Active Inference Models

Sep 16, 2022

Stefano Ferraro, Toon Van de Maele, Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt

Figure 1 for Disentangling Shape and Pose for Object-Centric Deep Active Inference Models

Figure 2 for Disentangling Shape and Pose for Object-Centric Deep Active Inference Models

Figure 3 for Disentangling Shape and Pose for Object-Centric Deep Active Inference Models

Figure 4 for Disentangling Shape and Pose for Object-Centric Deep Active Inference Models

Abstract:Active inference is a first principles approach for understanding the brain in particular, and sentient agents in general, with the single imperative of minimizing free energy. As such, it provides a computational account for modelling artificial intelligent agents, by defining the agent's generative model and inferring the model parameters, actions and hidden state beliefs. However, the exact specification of the generative model and the hidden state space structure is left to the experimenter, whose design choices influence the resulting behaviour of the agent. Recently, deep learning methods have been proposed to learn a hidden state space structure purely from data, alleviating the experimenter from this tedious design task, but resulting in an entangled, non-interpreteable state space. In this paper, we hypothesize that such a learnt, entangled state space does not necessarily yield the best model in terms of free energy, and that enforcing different factors in the state space can yield a lower model complexity. In particular, we consider the problem of 3D object representation, and focus on different instances of the ShapeNet dataset. We propose a model that factorizes object shape, pose and category, while still learning a representation for each factor using a deep neural network. We show that models, with best disentanglement properties, perform best when adopted by an active agent in reaching preferred observations.

Via

Access Paper or Ask Questions