Abstract:At-home rehabilitation for post-stroke patients presents significant challenges, as continuous, personalized care is often limited outside clinical settings. Additionally, the absence of comprehensive solutions addressing diverse rehabilitation needs in home environments complicates recovery efforts. Here, we introduce a smart home platform that integrates wearable sensors, ambient monitoring, and large language model (LLM)-powered assistance to provide seamless health monitoring and intelligent support. The system leverages machine learning enabled plantar pressure arrays for motor recovery assessment (94% classification accuracy), a wearable eye-tracking module for cognitive evaluation, and ambient sensors for precise smart home control (100% operational success, <1 s latency). Additionally, the LLM-powered agent, Auto-Care, offers real-time interventions, such as health reminders and environmental adjustments, enhancing user satisfaction by 29%. This work establishes a fully integrated platform for long-term, personalized rehabilitation, offering new possibilities for managing chronic conditions and supporting aging populations.
Abstract:Wearable silent speech systems hold significant potential for restoring communication in patients with speech impairments. However, seamless, coherent speech remains elusive, and clinical efficacy is still unproven. Here, we present an AI-driven intelligent throat (IT) system that integrates throat muscle vibrations and carotid pulse signal sensors with large language model (LLM) processing to enable fluent, emotionally expressive communication. The system utilizes ultrasensitive textile strain sensors to capture high-quality signals from the neck area and supports token-level processing for real-time, continuous speech decoding, enabling seamless, delay-free communication. In tests with five stroke patients with dysarthria, IT's LLM agents intelligently corrected token errors and enriched sentence-level emotional and logical coherence, achieving low error rates (4.2% word error rate, 2.9% sentence error rate) and a 55% increase in user satisfaction. This work establishes a portable, intuitive communication platform for patients with dysarthria with the potential to be applied broadly across different neurological conditions and in multi-language support systems.
Abstract:In the evolving application of medical artificial intelligence, federated learning is notable for its ability to protect training data privacy. Federated learning facilitates collaborative model development without the need to share local data from healthcare institutions. Yet, the statistical and system heterogeneity among these institutions poses substantial challenges, which affects the effectiveness of federated learning and hampers the exchange of information between clients. To address these issues, we introduce a novel approach, MH-pFLGB, which employs a global bypass strategy to mitigate the reliance on public datasets and navigate the complexities of non-IID data distributions. Our method enhances traditional federated learning by integrating a global bypass model, which would share the information among the clients, but also serves as part of the network to enhance the performance on each client. Additionally, MH-pFLGB provides a feature fusion module to better combine the local and global features. We validate \model{}'s effectiveness and adaptability through extensive testing on different medical tasks, demonstrating superior performance compared to existing state-of-the-art methods.
Abstract:In medical image segmentation, personalized cross-silo federated learning (FL) is becoming popular for utilizing varied data across healthcare settings to overcome data scarcity and privacy concerns. However, existing methods often suffer from client drift, leading to inconsistent performance and delayed training. We propose a new framework, Personalized Federated Learning via Feature Enhancement (pFLFE), designed to mitigate these challenges. pFLFE consists of two main stages: feature enhancement and supervised learning. The first stage improves differentiation between foreground and background features, and the second uses these enhanced features for learning from segmentation masks. We also design an alternative training approach that requires fewer communication rounds without compromising segmentation quality, even with limited communication resources. Through experiments on three medical segmentation tasks, we demonstrate that pFLFE outperforms the state-of-the-art methods.
Abstract:The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held in CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.
Abstract:Artificial nociceptors, mimicking human-like stimuli perception, are of significance for intelligent robotics to work in hazardous and dynamic scenarios. One of the most essential characteristics of the human nociceptor is its self-adjustable attribute, which indicates that the threshold of determination of a potentially hazardous stimulus relies on environmental knowledge. This critical attribute has been currently omitted, but it is highly desired for artificial nociceptors. Inspired by these shortcomings, this article presents, for the first time, a Self-Directed Channel (SDC) memristor-based self-reconfigurable nociceptor, capable of perceiving hazardous pressure stimuli under different temperatures and demonstrates key features of tactile nociceptors, including 'threshold,' 'no-adaptation,' and 'sensitization.' The maximum amplification of hazardous external stimuli is 1000%, and its response characteristics dynamically adapt to current temperature conditions by automatically altering the generated modulation schemes for the memristor. The maximum difference ratio of the response of memristors at different temperatures is 500%, and this adaptability closely mimics the functions of biological tactile nociceptors, resulting in accurate danger perception in various conditions. Beyond temperature adaptation, this memristor-based nociceptor has the potential to integrate different sensory modalities by applying various sensors, thereby achieving human-like perception capabilities in real-world environments.
Abstract:Memristive neuromorphic systems are designed to emulate human perception and cognition, where the memristor states represent essential historical information to perform both low-level and high-level tasks. However, current systems face challenges with the separation of state modulation and acquisition, leading to undesired time delays that impact real-time performance. To overcome this issue, we introduce a dual-function circuit that concurrently modulates and acquires memristor state information. This is achieved through two key features: 1) a feedback operational amplifier (op-amp) based circuit that ensures precise voltage application on the memristor while converting the passing current into a voltage signal; 2) a division calculation circuit that acquires state information from the modulation voltage and the converted voltage, improving stability by leveraging the intrinsic threshold characteristics of memristors. This circuit has been evaluated in a memristor-based nociceptor and a memristor crossbar, demonstrating exceptional performance. For instance, it achieves mean absolute acquisition errors below 1 {\Omega} during the modulation process in the nociceptor application. These results demonstrate that the proposed circuit can operate at different scales, holding the potential to enhance a wide range of neuromorphic applications.
Abstract:Federated learning is widely used in medical applications for training global models without needing local data access. However, varying computational capabilities and network architectures (system heterogeneity), across clients pose significant challenges in effectively aggregating information from non-independently and identically distributed (non-IID) data. Current federated learning methods using knowledge distillation require public datasets, raising privacy and data collection issues. Additionally, these datasets require additional local computing and storage resources, which is a burden for medical institutions with limited hardware conditions. In this paper, we introduce a novel federated learning paradigm, named Model Heterogeneous personalized Federated Learning via Injection and Distillation (MH-pFLID). Our framework leverages a lightweight messenger model that carries concentrated information to collect the information from each client. We also develop a set of receiver and transmitter modules to receive and send information from the messenger model, so that the information could be injected and distilled with efficiency.
Abstract:This report details the development and key achievements of our latest language model designed for custom large language models. The advancements introduced include a novel Online Data Scheduler that supports flexible training data adjustments and curriculum learning. The model's architecture is fortified with state-of-the-art techniques such as Rotary Positional Embeddings, QK-LayerNorm, and a specially crafted multilingual tokenizer to enhance stability and performance. Moreover, our robust training framework incorporates advanced monitoring and rapid recovery features to ensure optimal efficiency. Our Wonton 7B model has demonstrated competitive performance on a range of multilingual and English benchmarks. Future developments will prioritize narrowing the performance gap with more extensively trained models, thereby enhancing the model's real-world efficacy and adaptability.GitHub: \url{https://github.com/nyonicai/nyonic-public}
Abstract:Model-based reinforcement learning (MBRL) attempts to use an available or a learned model to improve the data efficiency of reinforcement learning. This work proposes a one-step lookback approach that jointly learns the latent-space model and the policy to realize the sample-efficient continuous robotic control, wherein the control-theoretical knowledge is utilized to decrease the model learning difficulty. Specifically, the so-called one-step backward data is utilized to facilitate the incremental evolution model, an alternative structured representation of the robotics evolution model in the MBRL field. The incremental evolution model accurately predicts the robotics movement but with low sample complexity. This is because the formulated incremental evolution model degrades the model learning difficulty into a parametric matrix learning problem, which is especially favourable to high-dimensional robotics applications. The imagined data from the learned incremental evolution model is used to supplement training data to enhance the sample efficiency. Comparative numerical simulations on benchmark continuous robotics control problems are conducted to validate the efficiency of our proposed one-step lookback approach.