Abstract:Human activity recognition (HAR) using radio frequency (RF) signals has garnered considerable attention for its applications in smart environments. However, traditional systems often struggle with limited independent channels between transmitters and receivers, multipath fading, and environmental noise, all of which degrade performance, particularly in through-the-wall scenarios. In this paper, we present a transmissive reconfigurable intelligent surface (TRIS)-assisted through-the-wall human activity recognition (TRIS-HAR) system. The system employs TRIS technology to actively reshape wireless signal propagation, creating multiple independent paths to enhance signal clarity and improve recognition accuracy in complex indoor settings. Additionally, we propose the Human intelligence Mamba (HiMamba), an advanced state space model that captures temporal and frequency-based information for precise activity recognition. HiMamba achieves state-of-the-art accuracy on two public datasets. Extensive experiments indicate that the TRIS-HAR system improves recognition performance from 85.00% to 98.06% in laboratory conditions and maintains high performance across various environments. This approach offers a robust solution for enhancing RF-based HAR, with promising applications in smart home and elderly care systems.
Abstract:Gait recognition with radio frequency (RF) signals enables many potential applications requiring accurate identification. However, current systems require individuals to be within a line-of-sight (LOS) environment and struggle with low signal-to-noise ratio (SNR) when signals traverse concrete and thick walls. To address these challenges, we present TRGR, a novel transmissive reconfigurable intelligent surface (RIS)-aided gait recognition system. TRGR can recognize human identities through walls using only the magnitude measurements of channel state information (CSI) from a pair of transceivers. Specifically, by leveraging transmissive RIS alongside a configuration alternating optimization algorithm, TRGR enhances wall penetration and signal quality, enabling accurate gait recognition. Furthermore, a residual convolution network (RCNN) is proposed as the backbone network to learn robust human information. Experimental results confirm the efficacy of transmissive RIS and highlight its significant potential for enhancing RF-based gait recognition systems. Extensive experiments show that TRGR achieves an average accuracy of 97.88% in identifying persons when signals traverse concrete walls, demonstrating its effectiveness and robustness.
Abstract:In dynamic autonomous driving environments, Artificial Intelligence-Generated Content (AIGC) technology can supplement vehicle perception and decision making by leveraging the generative and predictive capabilities of models, and it has the potential to enhance motion planning, trajectory prediction, and traffic simulation. This article proposes a cloud-edge-terminal collaborative architecture to support AIGC for autonomous driving. By examining the unique properties of AIGC services, this article makes an initial attempt to construct mutually supportive AIGC and network systems for autonomous driving. This includes communication, storage, and computation resource allocation schemes that support AIGC services, as well as the use of AIGC to assist system design and resource management.
Abstract:Assessment of the glomerular basement membrane (GBM) in transmission electron microscopy (TEM) is crucial for diagnosing chronic kidney disease (CKD). The lack of domain-independent automatic segmentation tools for the GBM necessitates an AI-based solution to automate the process. In this study, we introduce GBMSeg, a training-free framework designed to automatically segment the GBM in TEM images guided only by a one-shot annotated reference. Specifically, GBMSeg first exploits the robust feature matching capabilities of a pretrained foundation model to generate initial prompt points, then introduces a series of novel automatic prompt engineering techniques across the feature and physical space to optimize the prompt scheme. Finally, GBMSeg employs a class-agnostic foundation segmentation model with the generated prompt scheme to obtain accurate segmentation results. Experimental results on 2538 TEM images that we collected confirm that GBMSeg achieves superior segmentation performance, with a Dice similarity coefficient (DSC) of 87.27% using only one labeled reference image in a training-free manner, outperforming recently proposed one-shot or few-shot methods. In summary, GBMSeg introduces a distinctive automatic prompt framework that facilitates robust domain-independent segmentation performance without training, particularly advancing the automatic prompting of foundation segmentation models for medical images. Future work involves automating the thickness measurement of the segmented GBM and quantifying pathological indicators, which holds significant potential for advancing pathology assessments in clinical applications. The source code is available at https://github.com/SnowRain510/GBMSeg
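For readers unfamiliar with the metric reported for GBMSeg above, the Dice similarity coefficient between a predicted mask A and a reference mask B is conventionally defined as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask representation, and the toy masks are assumptions made purely for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]],
                     target: Sequence[Sequence[int]]) -> float:
    """Dice similarity coefficient for two binary masks given as 2D lists.

    Hypothetical helper for illustration; any nonzero pixel counts as foreground.
    """
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    # |A ∩ B|: pixels that are foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    # |A| + |B|: total foreground pixels across both masks.
    total = sum(pred_flat) + sum(target_flat)
    if total == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy masks (made up): 3 foreground pixels each, 2 of them shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
print(f"DSC = {dice_coefficient(pred, target):.4f}")
```

A DSC of 1 means the two masks coincide exactly and 0 means they share no foreground pixels, so a reported percentage corresponds to this ratio multiplied by 100.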
Abstract:This paper investigates a Stacked Intelligent Metasurfaces (SIM)-assisted Integrated Sensing and Communications (ISAC) system. An extended target model is considered, where the base station (BS) aims to estimate the complete target response matrix relative to the SIM. Under constraints on the minimum Signal-to-Interference-plus-Noise Ratio (SINR) for the communication users (CUs) and on the maximum transmit power, we jointly optimize the transmit beamforming at the BS and the end-to-end transmission matrix of the SIM to minimize the Cramér-Rao Bound (CRB) for target estimation. Alternating optimization (AO) and semidefinite relaxation (SDR) are employed to solve the non-convex SINR-constrained CRB minimization problem. Finally, we design and build an experimental platform for SIM and evaluate the performance of the proposed algorithms for communication and sensing tasks.
Abstract:Object detection and localization are crucial tasks for biomedical image analysis, particularly in the field of hematology, where the detection and recognition of blood cells are essential for diagnosis and treatment decisions. While attention-based methods have shown significant progress in object detection in various domains, their application in medical object detection has been limited due to the unique challenges posed by medical imaging datasets. To address this issue, we propose ADA-YOLO, a light-weight yet effective method for medical object detection that integrates attention-based mechanisms with the YOLOv8 architecture. Our proposed method leverages dynamic feature localisation and parallel regression for computer vision tasks through an adaptive head module. Empirical experiments were conducted on the Blood Cell Count and Detection (BCCD) dataset to evaluate the effectiveness of ADA-YOLO. The results show that ADA-YOLO outperforms the YOLOv8 model in mean average precision (mAP) on the BCCD dataset while using more than 3 times less space than YOLOv8. This indicates that our proposed method is effective. Moreover, the light-weight nature of our proposed method makes it suitable for deployment in resource-constrained environments such as mobile devices or edge computing systems, which could ultimately lead to improved diagnosis and treatment outcomes in the field of hematology.
Abstract:In this paper, the limitations of the YOLOv5s model on the small target detection task are studied in depth and addressed. The performance of the model is enhanced by introducing a GhostNet-based convolutional module, RepGFPN-based optimization of the Neck module, CA and Transformer attention mechanisms, and a loss function improved using NWD. The experimental results validate the positive impact of these improvement strategies on model precision, recall, and mAP. In particular, the improved model shows significant superiority in dealing with complex backgrounds and tiny targets in real-world application tests. This study provides an effective optimization strategy for the YOLOv5s model on small target detection and lays a solid foundation for future related research and applications.
Abstract:As global attention on renewable and clean energy grows, the research and implementation of microgrids become paramount. This paper examines the relationship between the operational and environmental costs of microgrids using multi-objective optimization models. By integrating optimization algorithms such as Genetic Algorithm, Simulated Annealing, Ant Colony Optimization, and Particle Swarm Optimization, we propose an integrated approach for microgrid optimization. Simulation results show that these algorithms yield different dispatch results under economic and environmental dispatch, revealing the distinct roles of diesel generators and micro gas turbines in microgrids. Overall, this study offers in-depth insights and practical guidance for microgrid design and operation.
Abstract:This paper introduces an innovative adaptive hybrid model for stock market prediction, leveraging the capabilities of an enhanced Variational Mode Decomposition (VMD), Feature Engineering (FE), and a stacked Informer integrated with an adaptive loss function. Through rigorous experimentation, the proposed model, termed Adam+GC+enhanced Informer (which we name VMGCformer), demonstrates significant proficiency in addressing the intricate dynamics and volatile nature of stock market data. Experimental results, derived from multiple benchmark datasets, underscore the model's superiority in terms of prediction accuracy, responsiveness, and generalization capabilities over traditional and other hybrid models. The research further highlights potential avenues for optimization and introduces future directions to enhance predictive modeling, especially for small enterprises and feature engineering.
Abstract:The sixth generation (6G) of mobile communication systems is witnessing a paradigm shift toward integrated sensing-communication systems. A comprehensive dataset is a prerequisite for 6G integrated sensing-communication research. This paper develops a novel simulation dataset, named M3SC, for mixed multi-modal (MMM) sensing-communication integration, and further presents the generation framework of the M3SC dataset. To obtain multi-modal sensory data in physical space and communication data in electromagnetic space, we utilize AirSim and WaveFarer to collect multi-modal sensory data and exploit Wireless InSite to collect communication data. Furthermore, the in-depth integration and precise alignment of AirSim, WaveFarer, and Wireless InSite are achieved. The M3SC dataset covers various weather conditions, various frequency bands, and different times of the day. Currently, the M3SC dataset contains 1500 snapshots, each including 80 RGB images, 160 depth maps, 80 LiDAR point clouds, 256 sets of mmWave waveforms with 8 radar point clouds, and 72 channel impulse response (CIR) matrices, for a total of 120,000 RGB images, 240,000 depth maps, 120,000 LiDAR point clouds, 384,000 sets of mmWave waveforms with 12,000 radar point clouds, and 108,000 CIR matrices. The data processing results present the multi-modal sensory information and the statistical properties of the communication channel. Finally, the MMM sensing-communication applications that can be supported by the M3SC dataset are discussed.
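The dataset totals quoted for M3SC above follow directly from multiplying each per-snapshot count by the 1500 snapshots. The short sketch below reproduces that arithmetic as a sanity check; the dictionary keys and variable names are illustrative labels chosen here, not identifiers from the dataset itself.

```python
# Per-snapshot counts and snapshot total as stated in the M3SC abstract.
SNAPSHOTS = 1500
PER_SNAPSHOT = {
    "RGB images": 80,
    "depth maps": 160,
    "LiDAR point clouds": 80,
    "mmWave waveform sets": 256,
    "radar point clouds": 8,
    "CIR matrices": 72,
}

# Total per modality = per-snapshot count x number of snapshots.
totals = {name: count * SNAPSHOTS for name, count in PER_SNAPSHOT.items()}

for name, total in totals.items():
    print(f"{name}: {total:,}")
```

Each computed total matches the corresponding figure stated in the abstract.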