Motoaki Kawanabe

SPD Learning for Covariance-Based Neuroimaging Analysis: Perspectives, Methods, and Challenges

Apr 26, 2025

Ce Ju, Reinmar J. Kobler, Antoine Collas, Motoaki Kawanabe, Cuntai Guan, Bertrand Thirion

Abstract: Neuroimaging provides a critical framework for characterizing brain activity by quantifying connectivity patterns and functional architecture across modalities. While modern machine learning has significantly advanced our understanding of neural processing mechanisms through these datasets, decoding task-specific signatures must contend with inherent neuroimaging constraints, for example, low signal-to-noise ratios in raw electrophysiological recordings, cross-session non-stationarity, and limited sample sizes. This review focuses on machine learning approaches for covariance-based neuroimaging data, where often symmetric positive definite (SPD) matrices under full-rank conditions encode inter-channel relationships. By equipping the space of SPD matrices with Riemannian metrics (e.g., affine-invariant or log-Euclidean), their space forms a Riemannian manifold enabling geometric analysis. We unify methodologies operating on this manifold under the SPD learning framework, which systematically leverages the SPD manifold's geometry to process covariance features, thereby advancing brain imaging analytics.

* 20 pages, 3 figures, 2 tables; this paper has been submitted for possible publication and is currently under review
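
The two Riemannian metrics named in the abstract above can be illustrated with a minimal NumPy sketch. It is not code from the paper: the helper names (`sym_apply`, `log_euclidean_distance`, `affine_invariant_distance`) and the random test data are assumptions made for illustration. It compares two sample covariance matrices before and after the same invertible linear mixing is applied to both.

```python
import numpy as np

def sym_apply(mat, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * fn(eigvals)) @ eigvecs.T

def log_euclidean_distance(a, b):
    """Frobenius norm of log(a) - log(b)."""
    return np.linalg.norm(sym_apply(a, np.log) - sym_apply(b, np.log), "fro")

def affine_invariant_distance(a, b):
    """Frobenius norm of log(a^{-1/2} b a^{-1/2})."""
    a_inv_sqrt = sym_apply(a, lambda v: 1.0 / np.sqrt(v))
    return np.linalg.norm(sym_apply(a_inv_sqrt @ b @ a_inv_sqrt, np.log), "fro")

# Hypothetical data: two 4-channel recordings with 500 samples each.
rng = np.random.default_rng(0)
cov_a = np.cov(rng.standard_normal((4, 500)))
cov_b = np.cov(rng.standard_normal((4, 500)))

# Apply the same invertible mixing to both covariance matrices.
mixing = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)
mixed_a = mixing @ cov_a @ mixing.T
mixed_b = mixing @ cov_b @ mixing.T

print("affine-invariant:", affine_invariant_distance(cov_a, cov_b),
      affine_invariant_distance(mixed_a, mixed_b))
print("log-Euclidean:  ", log_euclidean_distance(cov_a, cov_b),
      log_euclidean_distance(mixed_a, mixed_b))
```

The affine-invariant distance is unchanged by the mixing, whereas the log-Euclidean distance generally is not; it is typically cheaper, because each matrix logarithm can be computed once and reused.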

Answerability Fields: Answerable Location Estimation via Diffusion Models

Jul 26, 2024

Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Koya Sakamoto, Motoaki Kawanabe

Abstract: In an era characterized by advancements in artificial intelligence and robotics, enabling machines to interact with and understand their environment is a critical research endeavor. In this paper, we propose Answerability Fields, a novel approach to predicting answerability within complex indoor environments. Leveraging a 3D question answering dataset, we construct a comprehensive Answerability Fields dataset, encompassing diverse scenes and questions from ScanNet. Using a diffusion model, we successfully infer and evaluate these Answerability Fields, demonstrating the importance of objects and their locations in answering questions within a scene. Our results showcase the efficacy of Answerability Fields in guiding scene-understanding tasks, laying the foundation for their application in enhancing interactions between intelligent agents and their environments.

* IROS 2024

Map-based Modular Approach for Zero-shot Embodied Question Answering

May 26, 2024

Koya Sakamoto, Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoaki Kawanabe

Abstract: Building robots capable of interacting with humans through natural language in the visual world presents a significant challenge in the field of robotics. To overcome this challenge, Embodied Question Answering (EQA) has been proposed as a benchmark task to measure the ability to identify an object by navigating through a previously unseen environment in response to human-posed questions. Although some methods have been proposed, their evaluations have been limited to simulations, without experiments in real-world scenarios. Furthermore, all of these methods are constrained by a limited vocabulary for question-and-answer interactions, making them unsuitable for practical applications. In this work, we propose a map-based modular EQA method that enables real robots to navigate unknown environments through frontier-based map creation and address unknown QA pairs using foundation models that support open vocabulary. Unlike the questions of the previous EQA dataset on Matterport 3D (MP3D), questions in our real-world experiments contain various question formats and vocabularies not included in the training data. We conduct comprehensive experiments on virtual environments (MP3D-EQA) and two real-world house environments and demonstrate that our method can perform EQA even in the real world.

CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data

Oct 28, 2023

Taiki Miyanishi, Fumiya Kitamori, Shuhei Kurita, Jungdae Lee, Motoaki Kawanabe, Nakamasa Inoue

Abstract: City-scale 3D point clouds are a promising way to express detailed and complicated outdoor structures. They encompass both the appearance and geometry features of segmented city components, including cars, streets, and buildings, that can be utilized for attractive applications such as user-interactive navigation of autonomous vehicles and drones. However, compared to the extensive text annotations available for images and indoor scenes, the scarcity of text annotations for outdoor scenes poses a significant challenge for achieving these applications. To tackle this problem, we introduce the CityRefer dataset for city-level visual grounding. The dataset consists of 35k natural language descriptions of 3D objects appearing in SensatUrban city scenes and 5k landmark labels synchronized with OpenStreetMap. To ensure the quality and accuracy of the dataset, all descriptions and labels in the CityRefer dataset are manually verified. We have also developed a baseline system that can learn encoded language descriptions, 3D object instances, and geographical information about the city's landmarks to perform visual grounding on the CityRefer dataset. To the best of our knowledge, the CityRefer dataset is the largest city-level visual grounding dataset for localizing specific 3D objects.

* NeurIPS D&B 2023. The first two authors contributed equally

SPD domain-specific batch normalization to crack interpretable unsupervised domain adaptation in EEG

Jun 02, 2022

Reinmar J Kobler, Jun-ichiro Hirayama, Qibin Zhao, Motoaki Kawanabe

Abstract: Electroencephalography (EEG) provides access to neuronal dynamics non-invasively with millisecond resolution, rendering it a viable method in neuroscience and healthcare. However, its utility is limited as current EEG technology does not generalize well across domains (i.e., sessions and subjects) without expensive supervised re-calibration. Contemporary methods cast this transfer learning (TL) problem as a multi-source/-target unsupervised domain adaptation (UDA) problem and address it with deep learning or shallow, Riemannian geometry aware alignment methods. Both directions have, so far, failed to consistently close the performance gap to state-of-the-art domain-specific methods based on tangent space mapping (TSM) on the symmetric positive definite (SPD) manifold. Here, we propose a theory-based machine learning framework that enables, for the first time, learning domain-invariant TSM models in an end-to-end fashion. To achieve this, we propose a new building block for geometric deep learning, which we denote SPD domain-specific momentum batch normalization (SPDDSMBN). An SPDDSMBN layer can transform domain-specific SPD inputs into domain-invariant SPD outputs, and can be readily applied to multi-source/-target and online UDA scenarios. In extensive experiments with 6 diverse EEG brain-computer interface (BCI) datasets, we obtain state-of-the-art performance in inter-session and -subject TL with a simple, intrinsically interpretable network architecture, which we denote TSMNet.

* 9 pages, submitted to NeurIPS 2022
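
The idea of mapping domain-specific SPD inputs to a common reference before tangent space mapping can be sketched in NumPy as follows. This is a deliberately simplified stand-in and not the SPDDSMBN layer: it has no momentum, no learned parameters, and no network. It only recenters each domain at its own Fréchet mean, estimated here with a plain fixed-point iteration, and then takes the matrix logarithm at the identity. All function names and the simulated data are assumptions.

```python
import numpy as np

def sym_apply(mat, fn):
    """Apply a scalar function to the eigenvalues of symmetric matrices (stacks allowed)."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * fn(eigvals)[..., None, :]) @ np.swapaxes(eigvecs, -1, -2)

def inv_sqrt(mat):
    return sym_apply(mat, lambda v: 1.0 / np.sqrt(v))

def frechet_mean(mats, n_iter=50):
    """Fixed-point iteration for the mean under the affine-invariant metric."""
    mean = mats.mean(axis=0)  # arithmetic mean as starting point
    for _ in range(n_iter):
        sqrt, isqrt = sym_apply(mean, np.sqrt), inv_sqrt(mean)
        step = sym_apply(isqrt @ mats @ isqrt, np.log).mean(axis=0)
        mean = sqrt @ sym_apply(step, np.exp) @ sqrt
    return mean

def recenter(mats):
    """Move a domain's matrices so that their Frechet mean becomes the identity."""
    isqrt = inv_sqrt(frechet_mean(mats))
    return isqrt @ mats @ isqrt

def tangent_vectors(mats):
    """Matrix log at the identity, vectorized with sqrt(2)-weighted off-diagonals."""
    rows, cols = np.triu_indices(mats.shape[-1])
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return sym_apply(mats, np.log)[:, rows, cols] * weights

# Hypothetical data: 40 trials, 3 channels; domain B is a linearly mixed copy of domain A.
rng = np.random.default_rng(0)
trials = rng.standard_normal((40, 3, 200))
covs_a = trials @ np.swapaxes(trials, -1, -2) / 200
mixing = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
covs_b = mixing @ covs_a @ mixing.T

features_a = tangent_vectors(recenter(covs_a))
features_b = tangent_vectors(recenter(covs_b))
print(np.abs(features_a.mean(axis=0)).max(), np.abs(features_b.mean(axis=0)).max())
```

After recentering, the tangent-space features of both domains are centered at zero, so a single linear model can be fit on features pooled across domains.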

On the interpretation of linear Riemannian tangent space model parameters in M/EEG

Jul 30, 2021

Reinmar J. Kobler, Jun-Ichiro Hirayama, Lea Hehenberger, Catarina Lopes-Dias, Gernot R. Müller-Putz, Motoaki Kawanabe

Abstract: Riemannian tangent space methods offer state-of-the-art performance in magnetoencephalography (MEG) and electroencephalography (EEG) based applications such as brain-computer interfaces and biomarker development. One limitation, particularly relevant for biomarker development, is limited model interpretability compared to established component-based methods. Here, we propose a method to transform the parameters of linear tangent space models into interpretable patterns. Using typical assumptions, we show that this approach identifies the true patterns of latent sources, encoding a target signal. In simulations and two real MEG and EEG datasets, we demonstrate the validity of the proposed approach and investigate its behavior when the model assumptions are violated. Our results confirm that Riemannian tangent space methods are robust to differences in the source patterns across observations. We found that this robustness property also transfers to the associated patterns.

Insights from Classifying Visual Concepts with Multiple Kernel Learning

Dec 16, 2011

Alexander Binder, Shinichi Nakajima, Marius Kloft, Christina Müller, Wojciech Samek, Ulf Brefeld, Klaus-Robert Müller, Motoaki Kawanabe

Abstract: Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques make it possible to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, so-called 1-norm MKL variants are often observed to be outperformed by an unweighted sum kernel. The contribution of this paper is twofold: We apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks within computer vision. We provide insights on benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum kernel SVM and the sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. About to be submitted to PLoS ONE.

* PLoS ONE 7(8): e38897, 2012
* 18 pages, 8 tables, 4 figures, format deviating from plos one submission format requirements for aesthetic reasons
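
The "linear combination of similarity matrices" mentioned in the abstract above can be made concrete with a small NumPy sketch. It does not learn the weights, which is the actual MKL problem. The weights below are hand-picked to contrast a sparse mixture, a non-sparse mixture, and the unweighted sum kernel, and the function names and toy features are assumptions.

```python
import numpy as np

def rbf_kernel(x, gamma):
    """Gaussian kernel matrix for the rows of x."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def combine_kernels(kernels, weights, p=1.0):
    """Non-negative linear combination with weights scaled to unit p-norm."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / np.linalg.norm(weights, ord=p)
    return np.tensordot(weights, kernels, axes=1)

# Hypothetical features: 30 samples with 5 dimensions.
rng = np.random.default_rng(0)
features = rng.standard_normal((30, 5))
kernels = np.stack([
    features @ features.T,      # linear kernel
    rbf_kernel(features, 0.1),  # wide Gaussian kernel
    rbf_kernel(features, 1.0),  # narrow Gaussian kernel
])

sum_kernel = combine_kernels(kernels, [1, 1, 1], p=1.0)           # unweighted sum, rescaled
sparse_kernel = combine_kernels(kernels, [0, 1, 0], p=1.0)        # sparse mixture
dense_kernel = combine_kernels(kernels, [0.2, 1.0, 0.5], p=2.0)   # non-sparse mixture
print(np.linalg.eigvalsh(dense_kernel).min())
```

Because the weights are non-negative, each combined matrix stays positive semidefinite and can be passed to any kernel classifier that accepts a precomputed kernel.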

Modeling sparse connectivity between underlying brain sources for EEG/MEG

Dec 12, 2009

Stefan Haufe, Ryota Tomioka, Guido Nolte, Klaus-Robert Mueller, Motoaki Kawanabe

Abstract: We propose a novel technique to assess functional brain connectivity in EEG/MEG signals. Our method, called Sparsely-Connected Sources Analysis (SCSA), can overcome the problem of volume conduction by modeling neural data innovatively with the following ingredients: (a) the EEG is assumed to be a linear mixture of correlated sources following a multivariate autoregressive (MVAR) model, (b) the demixing is estimated jointly with the source MVAR parameters, (c) overfitting is avoided by using the Group Lasso penalty. This approach makes it possible to extract the appropriate level of cross-talk between the extracted sources, and in this manner we obtain a sparse data-driven model of functional connectivity. We demonstrate the usefulness of SCSA with simulated data, and compare to a number of existing algorithms with excellent results.

* IEEE Trans. Biomed. Eng. 57(8) (2010) 1954-1963
* 9 pages, 6 figures
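
The generative assumption in ingredient (a) and the penalty in ingredient (c) can be sketched in NumPy as below. This is not the SCSA estimation procedure: the coefficients, the mixing matrix, the Gaussian innovations, and the grouping of each off-diagonal source pair across lags are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_mvar(coeffs, n_samples, rng, burn_in=100):
    """Simulate s(t) = sum_p coeffs[p] @ s(t - p - 1) + innovation(t)."""
    order, dim, _ = coeffs.shape
    total = n_samples + burn_in + order
    sources = np.zeros((dim, total))
    noise = rng.standard_normal((dim, total))
    for t in range(order, total):
        past = sum(coeffs[p] @ sources[:, t - p - 1] for p in range(order))
        sources[:, t] = past + noise[:, t]
    return sources[:, -n_samples:]  # drop the burn-in segment

def group_lasso_penalty(coeffs):
    """Sum over off-diagonal source pairs of the L2 norm across lags."""
    group_norms = np.sqrt(np.sum(coeffs ** 2, axis=0))
    return group_norms.sum() - np.trace(group_norms)

# Sparse connectivity: only source 0 drives source 1, at lag 1.
coeffs = np.array([
    [[0.5, 0.0, 0.0],
     [0.3, 0.5, 0.0],
     [0.0, 0.0, 0.5]],
    -0.2 * np.eye(3),
])

rng = np.random.default_rng(0)
sources = simulate_mvar(coeffs, n_samples=1000, rng=rng)

# Sensor signals are a linear mixture of the sources.
mixing = rng.standard_normal((3, 3)) + 2.0 * np.eye(3)
sensors = mixing @ sources

# With the true mixing known, demixing recovers the sources exactly.
recovered = np.linalg.solve(mixing, sensors)
print(group_lasso_penalty(coeffs))
```

In the setting described by the abstract the mixing is unknown and is estimated jointly with the MVAR coefficients; the sketch only shows what is being estimated and what the penalty measures.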

How to Explain Individual Classification Decisions

Dec 06, 2009

David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, Klaus-Robert Mueller

Abstract: After building a classifier with modern tools of machine learning, we typically have a black box at hand that is able to predict well for unseen data. Thus, we get an answer to the question of what the most likely label of a given unseen data point is. However, most methods will provide no answer as to why the model predicted the particular label for a single instance and what features were most influential for that particular instance. The only methods currently able to provide such explanations are decision trees. This paper proposes a procedure which (based on a set of assumptions) makes it possible to explain the decisions of any classification method.

* 31 pages, 14 figures
