Abstract: We propose a novel approach to image classification inspired by complex nonlinear biological visual processing, whereby classical convolutional neural networks (CNNs) are equipped with learnable higher-order convolutions. Our model incorporates a Volterra-like expansion of the convolution operator, capturing multiplicative interactions akin to those observed in early and advanced stages of biological visual processing. We evaluated this approach on synthetic datasets, by measuring sensitivity to testing higher-order correlations, and on standard benchmarks (MNIST, FashionMNIST, CIFAR10, CIFAR100 and Imagenette). Our architecture outperforms traditional CNN baselines and achieves optimal performance with expansions up to 3rd/4th order, aligning remarkably well with the distribution of pixel intensities in natural images. Through systematic perturbation analysis, we validate this alignment by isolating the contributions of specific image statistics to model performance, demonstrating how different orders of convolution process distinct aspects of visual information. Furthermore, Representational Similarity Analysis reveals distinct geometries across network layers, indicating qualitatively different modes of visual information processing. Our work bridges neuroscience and deep learning, offering a path towards more effective, biologically inspired computer vision models. It provides insights into visual information processing and lays the groundwork for neural networks that better capture complex visual patterns, particularly in resource-constrained scenarios.
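To make the idea of a Volterra-like expansion concrete, the following is a minimal sketch, not the authors' implementation: a 1D, second-order layer written in plain Python, in the cross-correlation form commonly used by CNNs. The function name `volterra_conv1d`, the kernels `w1` and `w2`, and the toy signal are all illustrative assumptions; the abstract itself refers to image inputs and expansions up to 3rd/4th order.

```python
from typing import List, Sequence


def volterra_conv1d(
    x: Sequence[float],
    w1: Sequence[float],
    w2: Sequence[Sequence[float]],
    bias: float = 0.0,
) -> List[float]:
    """Valid-mode 1D convolution with an added second-order (quadratic) term.

    For each window p of length k the output is
        bias + sum_i w1[i] * p[i] + sum_{i,j} w2[i][j] * p[i] * p[j].
    Setting w2 to zeros recovers an ordinary (first-order) convolution.
    """
    k = len(w1)
    out = []
    for n in range(len(x) - k + 1):
        patch = x[n:n + k]
        # First-order term: the usual linear filter response.
        linear = sum(w1[i] * patch[i] for i in range(k))
        # Second-order term: weighted products of pairs of inputs in the window.
        quadratic = sum(
            w2[i][j] * patch[i] * patch[j] for i in range(k) for j in range(k)
        )
        out.append(bias + linear + quadratic)
    return out


# Toy example (hypothetical values, chosen only for illustration).
signal = [1.0, 2.0, 3.0, 4.0]
w1 = [1.0, -1.0]
w2 = [[0.0, 0.5], [0.5, 0.0]]
first_order = volterra_conv1d(signal, w1, [[0.0, 0.0], [0.0, 0.0]])
second_order = volterra_conv1d(signal, w1, w2)
```

On this toy signal the linear filter alone returns a constant response, whereas the quadratic term adds the product of neighbouring samples, which is the kind of multiplicative interaction the expansion is meant to capture.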
Abstract: Computational models of vision have traditionally been developed in a bottom-up fashion, by hierarchically composing a series of straightforward operations (i.e., convolution and pooling) with the aim of emulating simple and complex cells in the visual cortex, which led to the introduction of deep convolutional neural networks (CNNs). Nevertheless, data obtained with recent neuronal recording techniques suggest that the nature of the computations carried out in the ventral visual stream is not completely captured by current deep CNN models. To fill the gap between the ventral visual stream and deep models, several benchmarks have been designed and organized into the Brain-Score platform, providing a way to perform multi-layer (V1, V2, V4, IT) and behavioral comparisons between the two. In our work, we aim to shift the focus to architectures that take into account lateral recurrent connections, a ubiquitous feature of the ventral visual stream, to devise adaptive receptive fields. Through recurrent connections, the input's long-range spatial dependencies can be captured in a local multi-step fashion and, as introduced with Gated Recurrent CNNs (GRCNN), the unbounded expansion of the neuron's receptive fields can be modulated through the use of gates. In order to increase the robustness of our approach and the biological fidelity of the activations, we employ specific data augmentation techniques in line with several of the scoring benchmarks. Enforcing some form of invariance through heuristics was found to be beneficial for neural predictivity.
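The interplay between recurrence and gating described above can be illustrated with a small sketch. It is a simplified 1D toy in plain Python, not the GRCNN layer itself: the update rule (feed-forward drive plus a sigmoid-gated recurrent term), the function names, the kernels and the `gate_bias` parameter are assumptions made for illustration. Feeding an impulse through the layer shows how the response spreads by one kernel radius per recurrent step, and how a closed gate suppresses that spread.

```python
import math
from typing import List, Sequence


def conv1d_same(x: Sequence[float], w: Sequence[float]) -> List[float]:
    """Zero-padded 1D cross-correlation that preserves length (odd kernel size)."""
    radius = len(w) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for i, wi in enumerate(w):
            m = n + i - radius
            if 0 <= m < len(x):
                acc += wi * x[m]
        out.append(acc)
    return out


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def gated_recurrent_conv(
    u: Sequence[float],
    w_ff: Sequence[float],
    w_rec: Sequence[float],
    w_gate: Sequence[float],
    steps: int,
    gate_bias: float = 0.0,
) -> List[float]:
    """Iterate state <- feedforward + gate * recurrent_conv(state) for `steps` steps."""
    feedforward = conv1d_same(u, w_ff)
    state = list(feedforward)  # step 0: feed-forward response only
    for _ in range(steps):
        recurrent = conv1d_same(state, w_rec)
        # The gate is computed from the current state and lies in (0, 1).
        gate = [sigmoid(g + gate_bias) for g in conv1d_same(state, w_gate)]
        state = [f + g * r for f, g, r in zip(feedforward, gate, recurrent)]
    return state


# Impulse input: the spread of its response mirrors receptive-field growth.
impulse = [0.0] * 11
impulse[5] = 1.0
identity, spread, zeros = [0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]

# Number of non-zero outputs after 0, 1, 2, 3 recurrent steps (gate half open).
support = [
    sum(1 for v in gated_recurrent_conv(impulse, identity, spread, zeros, t) if v != 0.0)
    for t in range(4)
]

# With a strongly negative gate bias the recurrent contribution is almost switched off.
closed = gated_recurrent_conv(impulse, identity, spread, zeros, 3, gate_bias=-20.0)
leak = sum(v for i, v in enumerate(closed) if i != 5)
```

With a size-3 recurrent kernel the support grows as 1, 3, 5, 7 over successive steps, while the nearly closed gate keeps almost all of the response at the impulse location.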
Abstract: The electrocorticogram (ECoG) characterizes hand movement intentions and gestures well. In the present work we investigate the possibility of enhancing hand pose classification, in a Rock-Paper-Scissors (and Rest) task, by introducing topological descriptors of time series data. We hypothesized that an innovative approach based on topological data analysis can extract hidden information that is not detectable with standard Brain Computer Interface (BCI) techniques. To investigate this hypothesis, we integrate topological features with power band features and feed them to several standard classifiers, e.g. Random Forest and Gradient Boosting. Model selection is then completed after a meticulous phase of Bayesian hyperparameter optimization. With our method, we observed robust results in terms of accuracy for a four-label classification problem with limited available data. Through feature importance investigation, we conclude that topological descriptors are able to extract useful discriminative information and provide novel insights. Since our data are restricted to single-patient recordings, generalization might be limited. Nevertheless, our method can be extended and applied to a wide range of neurophysiological recordings, and it might be an intriguing point of departure for future studies.
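The abstract does not state which topological descriptors are used. As one possible illustration, the sketch below computes 0-dimensional sublevel-set persistence of a 1D time series, a standard construction in topological data analysis, and summarizes it into a few scalar features that could sit alongside power band features. The function names, the chosen summary statistics and the toy series are assumptions, not the authors' pipeline.

```python
from typing import Dict, List, Sequence, Tuple


def sublevel_persistence(series: Sequence[float]) -> List[Tuple[float, float]]:
    """0-dimensional sublevel-set persistence pairs (birth, death) of a 1D series.

    Samples are added in increasing order of value. Each local minimum starts a
    component; when two components meet, the younger one (higher birth) dies.
    The component of the global minimum never dies and is not reported.
    """
    n = len(series)
    parent = [-1] * n  # -1 marks samples that have not been added yet
    birth: Dict[int, float] = {}

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: series[k]):
        parent[i] = i
        birth[i] = series[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                older, younger = find(i), find(j)
                if older == younger:
                    continue
                if birth[older] > birth[younger]:
                    older, younger = younger, older
                # Elder rule: the younger component dies at the current value.
                if series[i] > birth[younger]:
                    pairs.append((birth[younger], series[i]))
                parent[younger] = older
    return sorted(pairs)


def persistence_features(pairs: Sequence[Tuple[float, float]]) -> Dict[str, float]:
    """Scalar summaries of a persistence diagram, usable as classifier inputs."""
    lifetimes = [death - birth for birth, death in pairs]
    return {
        "n_pairs": float(len(lifetimes)),
        "total_persistence": sum(lifetimes),
        "max_persistence": max(lifetimes, default=0.0),
    }


# Toy series (hypothetical values): three local minima at 0.0, 1.0 and 0.5.
toy_series = [0.0, 2.0, 1.0, 3.0, 0.5, 4.0]
toy_pairs = sublevel_persistence(toy_series)
toy_features = persistence_features(toy_pairs)
```

In a setting like the one described, such features would be computed per channel and per time window and concatenated with the power band features before being passed to a classifier.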
Abstract: The aim of this work is to evaluate the feasibility of re-implementing some key parts of the widely used Weather Research and Forecasting WRF-SFIRE simulator by replacing its core numerical differential-equation solvers with state-of-the-art physics-informed machine learning techniques for solving ODEs and PDEs, in order to transform it into a real-time simulator for wildfire spread prediction. The main programming language used is Julia, a compiled language that offers better performance than interpreted ones and provides Just-in-Time (JIT) compilation with different optimization levels. Moreover, Julia is particularly well suited for numerical computation and for the solution of complex physical models, considering both its syntax and the availability of specific libraries such as DifferentialEquations.jl and ModelingToolkit.jl.
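The core idea behind physics-informed solvers, namely fitting a parametric function by minimizing the residual of the differential equation at collocation points, can be sketched on a toy problem. The example below is written in plain Python for illustration only (the work itself uses Julia), replaces the neural network with a two-parameter polynomial trial solution, and solves the toy ODE du/dt = -K*u with u(0) = 1, which is unrelated to the WRF-SFIRE equations. All names and constants are assumptions.

```python
import math

K = 1.0    # decay rate of the toy ODE du/dt = -K * u (assumption)
U0 = 1.0   # initial condition u(0)
POINTS = [i / 20 for i in range(21)]  # collocation points on [0, 1]


def trial(t: float, a: float, b: float) -> float:
    """Trial solution that satisfies u(0) = U0 by construction."""
    return U0 + a * t + b * t * t


def trial_dt(t: float, a: float, b: float) -> float:
    """Analytic time derivative of the trial solution."""
    return a + 2.0 * b * t


def residual_loss(a: float, b: float) -> float:
    """Mean squared ODE residual u'(t) + K * u(t) over the collocation points."""
    return sum(
        (trial_dt(t, a, b) + K * trial(t, a, b)) ** 2 for t in POINTS
    ) / len(POINTS)


def fit(steps: int = 3000, lr: float = 0.1):
    """Plain gradient descent on the residual loss, with hand-derived gradients."""
    a = b = 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for t in POINTS:
            r = trial_dt(t, a, b) + K * trial(t, a, b)
            grad_a += 2.0 * r * (1.0 + K * t)          # d(residual)/da = 1 + K*t
            grad_b += 2.0 * r * (2.0 * t + K * t * t)  # d(residual)/db = 2t + K*t^2
        a -= lr * grad_a / len(POINTS)
        b -= lr * grad_b / len(POINTS)
    return a, b


a_fit, b_fit = fit()
u_end = trial(1.0, a_fit, b_fit)   # approximation of u(1)
u_exact = U0 * math.exp(-K * 1.0)  # exact solution, for comparison
```

No solution data are used during fitting: the loss depends only on the equation and the initial condition, which is what distinguishes this family of methods from purely data-driven regression. A physics-informed neural network follows the same pattern with a network in place of the polynomial and automatic differentiation in place of the hand-derived gradients.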