Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Soowon Kim

Lightweight Diffusion-based Framework for Online Imagined Speech Decoding in Aphasia

Nov 11, 2025

Eunyeong Ko, Soowon Kim, Ha-Na Jo

Figure 1 for Lightweight Diffusion-based Framework for Online Imagined Speech Decoding in Aphasia

Figure 2 for Lightweight Diffusion-based Framework for Online Imagined Speech Decoding in Aphasia

Abstract:A diffusion-based neural decoding framework optimized for real-time imagined speech classification in individuals with aphasia. The system integrates a lightweight conditional diffusion encoder and convolutional classifier trained using subject-specific EEG data acquired from a Korean-language paradigm. A dual-criterion early stopping strategy enabled rapid convergence under limited calibration data, while dropout regularization and grouped temporal convolutions ensured stable generalization. During online operation, continuous EEG streams were processed in two-second sliding windows to generate class probabilities that dynamically modulated visual and auditory feedback according to decoding confidence. Across twenty real-time trials, the framework achieved 65% top-1 and 70% top-2 accuracy, outperforming offline evaluation (50% top-1). These results demonstrate the feasibility of deploying diffusion-based EEG decoding under practical clinical constraints, maintaining reliable performance despite environmental variability and minimal preprocessing. The proposed framework advances the translation of imagined speech brain-computer interfaces toward clinical communication support for individuals with severe expressive language impairment.

* 4 pages, 2 figures, 1 table, Name of Conference: International Conference on Brain-Computer Interface

Via

Access Paper or Ask Questions

Subject-Independent Imagined Speech Detection via Cross-Subject Generalization and Calibration

Nov 11, 2025

Byung-Kwan Ko, Soowon Kim, Seo-Hyun Lee

Abstract:Achieving robust generalization across individuals remains a major challenge in electroencephalogram based imagined speech decoding due to substantial variability in neural activity patterns. This study examined how training dynamics and lightweight subject specific adaptation influence cross subject performance in a neural decoding framework. A cyclic inter subject training approach, involving shorter per subject training segments and frequent alternation among subjects, led to modest yet consistent improvements in decoding performance across unseen target data. Furthermore, under the subject calibrated leave one subject out scheme, incorporating only 10 % of the target subjects data for calibration achieved an accuracy of 0.781 and an AUC of 0.801, demonstrating the effectiveness of few shot adaptation. These findings suggest that integrating cyclic training with minimal calibration provides a simple and effective strategy for developing scalable, user adaptive brain computer interface systems that balance generalization and personalization.

* 4 pages, 2 figures, Name of Conference: International Conference on Brain-Computer Interface

Via

Access Paper or Ask Questions

Confidence-Aware Neural Decoding of Overt Speech from EEG: Toward Robust Brain-Computer Interfaces

Nov 11, 2025

Soowon Kim, Byung-Kwan Ko, Seo-Hyun Lee

Figure 1 for Confidence-Aware Neural Decoding of Overt Speech from EEG: Toward Robust Brain-Computer Interfaces

Figure 2 for Confidence-Aware Neural Decoding of Overt Speech from EEG: Toward Robust Brain-Computer Interfaces

Abstract:Non-invasive brain-computer interfaces that decode spoken commands from electroencephalogram must be both accurate and trustworthy. We present a confidence-aware decoding framework that couples deep ensembles of compact, speech-oriented convolutional networks with post-hoc calibration and selective classification. Uncertainty is quantified using ensemble-based predictive entropy, top-two margin, and mutual information, and decisions are made with an abstain option governed by an accuracy-coverage operating point. The approach is evaluated on a multi-class overt speech dataset using a leakage-safe, block-stratified split that respects temporal contiguity. Compared with widely used baselines, the proposed method yields more reliable probability estimates, improved selective performance across operating points, and balanced per-class acceptance. These results suggest that confidence-aware neural decoding can provide robust, deployment-oriented behavior for real-world brain-computer interface communication systems.

Via

Access Paper or Ask Questions

EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models

Nov 14, 2024

Soowon Kim, Ha-Na Jo, Eunyeong Ko

Figure 1 for EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models

Figure 2 for EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models

Figure 3 for EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models

Abstract:In this study, we propose an ensemble learning framework for electroencephalogram-based overt speech classification, leveraging denoising diffusion probabilistic models with varying convolutional kernel sizes. The ensemble comprises three models with kernel sizes of 51, 101, and 201, effectively capturing multi-scale temporal features inherent in signals. This approach improves the robustness and accuracy of speech decoding by accommodating the rich temporal complexity of neural signals. The ensemble models work in conjunction with conditional autoencoders that refine the reconstructed signals and maximize the useful information for downstream classification tasks. The results indicate that the proposed ensemble-based approach significantly outperforms individual models and existing state-of-the-art techniques. These findings demonstrate the potential of ensemble methods in advancing brain signal decoding, offering new possibilities for non-verbal communication applications, particularly in brain-computer interface systems aimed at aiding individuals with speech impairments.

Via

Access Paper or Ask Questions

Dynamic Neural Communication: Convergence of Computer Vision and Brain-Computer Interface

Nov 14, 2024

Ji-Ha Park, Seo-Hyun Lee, Soowon Kim, Seong-Whan Lee

Figure 1 for Dynamic Neural Communication: Convergence of Computer Vision and Brain-Computer Interface

Figure 2 for Dynamic Neural Communication: Convergence of Computer Vision and Brain-Computer Interface

Figure 3 for Dynamic Neural Communication: Convergence of Computer Vision and Brain-Computer Interface

Abstract:Interpreting human neural signals to decode static speech intentions such as text or images and dynamic speech intentions such as audio or video is showing great potential as an innovative communication tool. Human communication accompanies various features, such as articulatory movements, facial expressions, and internal speech, all of which are reflected in neural signals. However, most studies only generate short or fragmented outputs, while providing informative communication by leveraging various features from neural signals remains challenging. In this study, we introduce a dynamic neural communication method that leverages current computer vision and brain-computer interface technologies. Our approach captures the user's intentions from neural signals and decodes visemes in short time steps to produce dynamic visual outputs. The results demonstrate the potential to rapidly capture and reconstruct lip movements during natural speech attempts from human neural signals, enabling dynamic neural communication through the convergence of computer vision and brain--computer interface.

* 4 pages, 2 figures, 1 table, Name of Conference: International Conference on Brain-Computer Interface

Via

Access Paper or Ask Questions

Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks

Dec 10, 2023

Seo-Hyun Lee, Young-Eun Lee, Soowon Kim, Byung-Kwan Ko, Jun-Young Kim, Seong-Whan Lee

Figure 1 for Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks

Figure 2 for Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks

Figure 3 for Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks

Abstract:Brain-to-speech technology represents a fusion of interdisciplinary applications encompassing fields of artificial intelligence, brain-computer interfaces, and speech synthesis. Neural representation learning based intention decoding and speech synthesis directly connects the neural activity to the means of human linguistic communication, which may greatly enhance the naturalness of communication. With the current discoveries on representation learning and the development of the speech synthesis technologies, direct translation of brain signals into speech has shown great promise. Especially, the processed input features and neural speech embeddings which are given to the neural network play a significant role in the overall performance when using deep generative models for speech generation from brain signals. In this paper, we introduce the current brain-to-speech technology with the possibility of speech synthesis from brain signals, which may ultimately facilitate innovation in non-verbal communication. Also, we perform comprehensive analysis on the neural features and neural speech embeddings underlying the neurophysiological activation while performing speech, which may play a significant role in the speech synthesis works.

* 4 pages

Via

Access Paper or Ask Questions

Brain-Driven Representation Learning Based on Diffusion Model

Nov 14, 2023

Soowon Kim, Seo-Hyun Lee, Young-Eun Lee, Ji-Won Lee, Ji-Ha Park, Seong-Whan Lee

Abstract:Interpreting EEG signals linked to spoken language presents a complex challenge, given the data's intricate temporal and spatial attributes, as well as the various noise factors. Denoising diffusion probabilistic models (DDPMs), which have recently gained prominence in diverse areas for their capabilities in representation learning, are explored in our research as a means to address this issue. Using DDPMs in conjunction with a conditional autoencoder, our new approach considerably outperforms traditional machine learning algorithms and established baseline models in accuracy. Our results highlight the potential of DDPMs as a sophisticated computational method for the analysis of speech-related EEG signals. This could lead to significant advances in brain-computer interfaces tailored for spoken communication.

Via

Access Paper or Ask Questions

Enhanced Generative Adversarial Networks for Unseen Word Generation from EEG Signals

Nov 14, 2023

Young-Eun Lee, Seo-Hyun Lee, Soowon Kim, Jung-Sun Lee, Deok-Seon Kim, Seong-Whan Lee

Abstract:Recent advances in brain-computer interface (BCI) technology, particularly based on generative adversarial networks (GAN), have shown great promise for improving decoding performance for BCI. Within the realm of Brain-Computer Interfaces (BCI), GANs find application in addressing many areas. They serve as a valuable tool for data augmentation, which can solve the challenge of limited data availability, and synthesis, effectively expanding the dataset and creating novel data formats, thus enhancing the robustness and adaptability of BCI systems. Research in speech-related paradigms has significantly expanded, with a critical impact on the advancement of assistive technologies and communication support for individuals with speech impairments. In this study, GANs were investigated, particularly for the BCI field, and applied to generate text from EEG signals. The GANs could generalize all subjects and decode unseen words, indicating its ability to capture underlying speech patterns consistent across different individuals. The method has practical applications in neural signal-based speech recognition systems and communication aids for individuals with speech difficulties.

* 5 pages, 2 figures

Via

Access Paper or Ask Questions

Diff-E: Diffusion-based Learning for Decoding Imagined Speech EEG

Jul 26, 2023

Soowon Kim, Young-Eun Lee, Seo-Hyun Lee, Seong-Whan Lee

Figure 1 for Diff-E: Diffusion-based Learning for Decoding Imagined Speech EEG

Figure 2 for Diff-E: Diffusion-based Learning for Decoding Imagined Speech EEG

Figure 3 for Diff-E: Diffusion-based Learning for Decoding Imagined Speech EEG

Abstract:Decoding EEG signals for imagined speech is a challenging task due to the high-dimensional nature of the data and low signal-to-noise ratio. In recent years, denoising diffusion probabilistic models (DDPMs) have emerged as promising approaches for representation learning in various domains. Our study proposes a novel method for decoding EEG signals for imagined speech using DDPMs and a conditional autoencoder named Diff-E. Results indicate that Diff-E significantly improves the accuracy of decoding EEG signals for imagined speech compared to traditional machine learning techniques and baseline models. Our findings suggest that DDPMs can be an effective tool for EEG signal decoding, with potential implications for the development of brain-computer interfaces that enable communication through imagined speech.

* Accepted to Interspeech 2023

Via

Access Paper or Ask Questions

Subject-Independent Classification of Brain Signals using Skip Connections

Jan 19, 2023

Soowon Kim, Ji-Won Lee, Young-Eun Lee, Seo-Hyun Lee

Figure 1 for Subject-Independent Classification of Brain Signals using Skip Connections

Figure 2 for Subject-Independent Classification of Brain Signals using Skip Connections

Figure 3 for Subject-Independent Classification of Brain Signals using Skip Connections

Abstract:Untapped potential for new forms of human-to-human communication can be found in the active research field of studies on the decoding of brain signals of human speech. A brain-computer interface system can be implemented using electroencephalogram signals because it poses more less clinical risk and can be acquired using portable instruments. One of the most interesting tasks for the brain-computer interface system is decoding words from the raw electroencephalogram signals. Before a brain-computer interface may be used by a new user, current electroencephalogram-based brain-computer interface research typically necessitates a subject-specific adaption stage. In contrast, the subject-independent situation is one that is highly desired since it allows a well-trained model to be applied to new users with little or no precalibration. The emphasis is on creating an efficient decoder that may be employed adaptively in subject-independent circumstances in light of this crucial characteristic. Our proposal is to explicitly apply skip connections between convolutional layers to enable the flow of mutual information between layers. To do this, we add skip connections between layers, allowing the mutual information to flow throughout the layers. The output of the encoder is then passed through the fully-connected layer to finally represent the probabilities of the 13 classes. In this study, overt speech was used to record the electroencephalogram data of 16 participants. The results show that when the skip connection is present, the classification performance improves notably.

* 4 pages, 3 figures, IEEE BCI winter conference 2023 accepted

Via

Access Paper or Ask Questions