Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Siyang Li

SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding

May 26, 2025

Hongbin Wang, Zhihong Jia, Yuanzhong Shen, Ziwei Wang, Siyang Li, Kai Shu, Feng Hu, Dongrui Wu

Abstract:Speech disorders such as dysarthria and anarthria can severely impair the patient's ability to communicate verbally. Speech decoding brain-computer interfaces (BCIs) offer a potential alternative by directly translating speech intentions into spoken words, serving as speech neuroprostheses. This paper reports an experimental protocol for Mandarin Chinese speech decoding BCIs, along with the corresponding decoding algorithms. Stereo-electroencephalography (SEEG) and synchronized audio data were collected from eight drug-resistant epilepsy patients as they conducted a word-level reading task. The proposed SEEG and Audio Contrastive Matching (SACM), a contrastive learning-based framework, achieved decoding accuracies significantly exceeding chance levels in both speech detection and speech decoding tasks. Electrode-wise analysis revealed that a single sensorimotor cortex electrode achieved performance comparable to that of the full electrode array. These findings provide valuable insights for developing more accurate online speech decoding BCIs.

Via

Access Paper or Ask Questions

Spatial Distillation based Distribution Alignment (SDDA) for Cross-Headset EEG Classification

Mar 07, 2025

Dingkun Liu, Siyang Li, Ziwei Wang, Wei Li, Dongrui Wu

Abstract:A non-invasive brain-computer interface (BCI) enables direct interaction between the user and external devices, typically via electroencephalogram (EEG) signals. However, decoding EEG signals across different headsets remains a significant challenge due to differences in the number and locations of the electrodes. To address this challenge, we propose a spatial distillation based distribution alignment (SDDA) approach for heterogeneous cross-headset transfer in non-invasive BCIs. SDDA uses first spatial distillation to make use of the full set of electrodes, and then input/feature/output space distribution alignments to cope with the significant differences between the source and target domains. To our knowledge, this is the first work to use knowledge distillation in cross-headset transfers. Extensive experiments on six EEG datasets from two BCI paradigms demonstrated that SDDA achieved superior performance in both offline unsupervised domain adaptation and online supervised domain adaptation scenarios, consistently outperforming 10 classical and state-of-the-art transfer learning algorithms.

* 10 pages, 5 figures

Via

Access Paper or Ask Questions

Multi-View Contrastive Network (MVCNet) for Motor Imagery Classification

Feb 27, 2025

Ziwei Wang, Siyang Li, Xiaoqing Chen, Wei Li, Dongrui Wu

Abstract:Objective: An electroencephalography (EEG)-based brain-computer interface (BCI) serves as a direct communication pathway between the human brain and an external device. While supervised learning has been extensively explored for motor imagery (MI) EEG classification, small data quantity has been a key factor limiting the performance of deep feature learning. Methods: This paper proposes a knowledge-driven time-space-frequency based multi-view contrastive network (MVCNet) for MI EEG decoding in BCIs. MVCNet integrates knowledge from the time, space, and frequency domains into the training process through data augmentations from multiple views, fostering more discriminative feature learning of the characteristics of EEG data. We introduce a cross-view contrasting module to learn from different augmented views and a cross-model contrasting module to enhance the consistency of features extracted between knowledge-guided and data-driven models. Results: The combination of EEG data augmentation strategies was systematically investigated for more informative supervised contrastive learning. Experiments on four public MI datasets and three different architectures demonstrated that MVCNet outperformed 10 existing approaches. Significance: Our approach can significantly boost EEG classification performance beyond designated networks, showcasing the potential to enhance the feature learning process for better EEG decoding.

* 9 pages, 7 figures

Via

Access Paper or Ask Questions

Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies

Feb 05, 2025

Siyang Li, Hongbin Wang, Xiaoqing Chen, Dongrui Wu

Figure 1 for Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies

Figure 2 for Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies

Figure 3 for Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies

Figure 4 for Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies

Abstract:Brain-computer interfaces (BCIs) enable direct communication between the brain and external devices. This review highlights the core decoding algorithms that enable multimodal BCIs, including a dissection of the elements, a unified view of diversified approaches, and a comprehensive analysis of the present state of the field. We emphasize algorithmic advancements in cross-modality mapping, sequential modeling, besides classic multi-modality fusion, illustrating how these novel AI approaches enhance decoding of brain data. The current literature of BCI applications on visual, speech, and affective decoding are comprehensively explored. Looking forward, we draw attention on the impact of emerging architectures like multimodal Transformers, and discuss challenges such as brain data heterogeneity and common errors. This review also serves as a bridge in this interdisciplinary field for experts with neuroscience background and experts that study AI, aiming to provide a comprehensive understanding for AI-powered multimodal BCIs.

Via

Access Paper or Ask Questions

Motor Imagery Classification for Asynchronous EEG-Based Brain-Computer Interfaces

Dec 12, 2024

Huanyu Wu, Siyang Li, Dongrui Wu

Abstract:Motor imagery (MI) based brain-computer interfaces (BCIs) enable the direct control of external devices through the imagined movements of various body parts. Unlike previous systems that used fixed-length EEG trials for MI decoding, asynchronous BCIs aim to detect the user's MI without explicit triggers. They are challenging to implement, because the algorithm needs to first distinguish between resting-states and MI trials, and then classify the MI trials into the correct task, all without any triggers. This paper proposes a sliding window prescreening and classification (SWPC) approach for MI-based asynchronous BCIs, which consists of two modules: a prescreening module to screen MI trials out of the resting-state, and a classification module for MI classification. Both modules are trained with supervised learning followed by self-supervised learning, which refines the feature extractors. Within-subject and cross-subject asynchronous MI classifications on four different EEG datasets validated the effectiveness of SWPC, i.e., it always achieved the highest average classification accuracy, and outperformed the best state-of-the-art baseline on each dataset by about 2%.

* IEEE Trans. on Neural Systems and Rehabilitation Engineering, 32:527-536, 2024

Via

Access Paper or Ask Questions

T-TIME: Test-Time Information Maximization Ensemble for Plug-and-Play BCIs

Dec 10, 2024

Siyang Li, Ziwei Wang, Hanbin Luo, Lieyun Ding, Dongrui Wu

Abstract:Objective: An electroencephalogram (EEG)-based brain-computer interface (BCI) enables direct communication between the human brain and a computer. Due to individual differences and non-stationarity of EEG signals, such BCIs usually require a subject-specific calibration session before each use, which is time-consuming and user-unfriendly. Transfer learning (TL) has been proposed to shorten or eliminate this calibration, but existing TL approaches mainly consider offline settings, where all unlabeled EEG trials from the new user are available. Methods: This paper proposes Test-Time Information Maximization Ensemble (T-TIME) to accommodate the most challenging online TL scenario, where unlabeled EEG data from the new user arrive in a stream, and immediate classification is performed. T-TIME initializes multiple classifiers from the aligned source data. When an unlabeled test EEG trial arrives, T-TIME first predicts its labels using ensemble learning, and then updates each classifier by conditional entropy minimization and adaptive marginal distribution regularization. Our code is publicized. Results: Extensive experiments on three public motor imagery based BCI datasets demonstrated that T-TIME outperformed about 20 classical and state-of-the-art TL approaches. Significance: To our knowledge, this is the first work on test time adaptation for calibration-free EEG-based BCIs, making plug-and-play BCIs possible.

* S. Li, Z. Wang, H. Luo, L. Ding and D. Wu, T-TIME: Test-Time Information Maximization Ensemble for Plug-and-Play BCIs, IEEE Trans. on Biomedical Engineering, 71(2):423-432, 2024

Via

Access Paper or Ask Questions

Channel Reflection: Knowledge-Driven Data Augmentation for EEG-Based Brain-Computer Interfaces

Dec 04, 2024

Ziwei Wang, Siyang Li, Jingwei Luo, Jiajing Liu, Dongrui Wu

Abstract:A brain-computer interface (BCI) enables direct communication between the human brain and external devices. Electroencephalography (EEG) based BCIs are currently the most popular for able-bodied users. To increase user-friendliness, usually a small amount of user-specific EEG data are used for calibration, which may not be enough to develop a pure data-driven decoding model. To cope with this typical calibration data shortage challenge in EEG-based BCIs, this paper proposes a parameter-free channel reflection (CR) data augmentation approach that incorporates prior knowledge on the channel distributions of different BCI paradigms in data augmentation. Experiments on eight public EEG datasets across four different BCI paradigms (motor imagery, steady-state visual evoked potential, P300, and seizure classifications) using different decoding algorithms demonstrated that: 1) CR is effective, i.e., it can noticeably improve the classification accuracy; 2) CR is robust, i.e., it consistently outperforms existing data augmentation approaches in the literature; and, 3) CR is flexible, i.e., it can be combined with other data augmentation approaches to further increase the performance. We suggest that data augmentation approaches like CR should be an essential step in EEG-based BCIs. Our code is available online.

* Neural Networks, 176:106351, 2024

Via

Access Paper or Ask Questions

Gated Parametric Neuron for Spike-based Audio Recognition

Dec 02, 2024

Haoran Wang, Herui Zhang, Siyang Li, Dongrui Wu

Abstract:Spiking neural networks (SNNs) aim to simulate real neural networks in the human brain with biologically plausible neurons. The leaky integrate-and-fire (LIF) neuron is one of the most widely studied SNN architectures. However, it has the vanishing gradient problem when trained with backpropagation. Additionally, its neuronal parameters are often manually specified and fixed, in contrast to the heterogeneity of real neurons in the human brain. This paper proposes a gated parametric neuron (GPN) to process spatio-temporal information effectively with the gating mechanism. Compared with the LIF neuron, the GPN has two distinguishing advantages: 1) it copes well with the vanishing gradients by improving the flow of gradient propagation; and, 2) it learns spatio-temporal heterogeneous neuronal parameters automatically. Additionally, we use the same gate structure to eliminate initial neuronal parameter selection and design a hybrid recurrent neural network-SNN structure. Experiments on two spike-based audio datasets demonstrated that the GPN network outperformed several state-of-the-art SNNs, could mitigate vanishing gradients, and had spatio-temporal heterogeneous parameters. Our work shows the ability of SNNs to handle long-term dependencies and achieve high performance simultaneously.

* NeuroComputing, 609:128477, 2024

Via

Access Paper or Ask Questions

Federated Motor Imagery Classification for Privacy-Preserving Brain-Computer Interfaces

Dec 02, 2024

Tianwang Jia, Lubin Meng, Siyang Li, Jiajing Liu, Dongrui Wu

Figure 1 for Federated Motor Imagery Classification for Privacy-Preserving Brain-Computer Interfaces

Figure 2 for Federated Motor Imagery Classification for Privacy-Preserving Brain-Computer Interfaces

Figure 3 for Federated Motor Imagery Classification for Privacy-Preserving Brain-Computer Interfaces

Figure 4 for Federated Motor Imagery Classification for Privacy-Preserving Brain-Computer Interfaces

Abstract:Training an accurate classifier for EEG-based brain-computer interface (BCI) requires EEG data from a large number of users, whereas protecting their data privacy is a critical consideration. Federated learning (FL) is a promising solution to this challenge. This paper proposes Federated classification with local Batch-specific batch normalization and Sharpness-aware minimization (FedBS) for privacy protection in EEG-based motor imagery (MI) classification. FedBS utilizes local batch-specific batch normalization to reduce data discrepancies among different clients, and sharpness-aware minimization optimizer in local training to improve model generalization. Experiments on three public MI datasets using three popular deep learning models demonstrated that FedBS outperformed six state-of-the-art FL approaches. Remarkably, it also outperformed centralized training, which does not consider privacy protection at all. In summary, FedBS protects user EEG data privacy, enabling multiple BCI users to participate in large-scale machine learning model training, which in turn improves the BCI decoding accuracy.

* IEEE Trans. on Neural Systems and Rehabilitation Engineering, 32:3442-3451, 2024

Via

Access Paper or Ask Questions

Channel-aware Contrastive Conditional Diffusion for Multivariate Probabilistic Time Series Forecasting

Oct 03, 2024

Siyang Li, Yize Chen, Hui Xiong

Figure 1 for Channel-aware Contrastive Conditional Diffusion for Multivariate Probabilistic Time Series Forecasting

Figure 2 for Channel-aware Contrastive Conditional Diffusion for Multivariate Probabilistic Time Series Forecasting

Figure 3 for Channel-aware Contrastive Conditional Diffusion for Multivariate Probabilistic Time Series Forecasting

Figure 4 for Channel-aware Contrastive Conditional Diffusion for Multivariate Probabilistic Time Series Forecasting

Abstract:Forecasting faithful trajectories of multivariate time series from practical scopes is essential for reasonable decision-making. Recent methods majorly tailor generative conditional diffusion models to estimate the target temporal predictive distribution. However, it remains an obstacle to enhance the exploitation efficiency of given implicit temporal predictive information to bolster conditional diffusion learning. To this end, we propose a generic channel-aware Contrastive Conditional Diffusion model entitled CCDM to achieve desirable Multivariate probabilistic forecasting, obviating the need for curated temporal conditioning inductive biases. In detail, we first design a channel-centric conditional denoising network to manage intra-variate variations and cross-variate correlations, which can lead to scalability on diverse prediction horizons and channel numbers. Then, we devise an ad-hoc denoising-based temporal contrastive learning to explicitly amplify the predictive mutual information between past observations and future forecasts. It can coherently complement naive step-wise denoising diffusion training and improve the forecasting accuracy and generality on unknown test time series. Besides, we offer theoretic insights on the benefits of such auxiliary contrastive training refinement from both neural mutual information and temporal distribution generalization aspects. The proposed CCDM can exhibit superior forecasting capability compared to current state-of-the-art diffusion forecasters over a comprehensive benchmark, with best MSE and CRPS outcomes on $66.67\%$ and $83.33\%$ cases. Our code is publicly available at https://github.com/LSY-Cython/CCDM.

Via

Access Paper or Ask Questions