Abstract:Semantic communications are considered a promising beyond-Shannon/bit paradigm to reduce network traffic and increase reliability, thus making wireless networks more energy efficient, robust, and sustainable. However, the performance is limited by the efficiency of the semantic transceivers, i.e., the achievable "similarity" between the transmitted and received signals. Under strict similarity conditions, semantic transmission may not be applicable and bit communication is mandatory. In this paper, for the first time in the literature, we propose a multi-carrier Hybrid Semantic-Shannon communication system where, without loss of generality, the case of text transmission is investigated. To this end, a joint semantic-bit transmission selection and power allocation optimization problem is formulated, aiming to minimize two transmission delay metrics widely used in the literature, subject to strict similarity thresholds. Although both problems are non-convex, alternating optimization decomposes each of them into a convex subproblem and a mixed-integer linear programming subproblem, both of which can be solved optimally. Furthermore, to improve the performance of the proposed hybrid schemes, a novel association of text sentences to subcarriers is proposed based on the data size of the sentences and the channel gains of the subcarriers. We show that the proposed association is optimal in terms of transmission delay. Numerical simulations verify the effectiveness of the proposed hybrid semantic-bit communication scheme and the derived sentence-to-subcarrier association, and provide useful insights into the design parameters of such systems.
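A minimal sketch of one way a size- and gain-based sentence-to-subcarrier association could look. The abstract does not state the actual rule, so everything here is an assumption: the sorted pairing (largest sentence on the strongest subcarrier), equal power on every subcarrier, a Shannon-rate delay model, and the names `associate_sentences` and `total_delay` are all hypothetical. The semantic/bit selection and power allocation steps are not modeled.

```python
import math


def associate_sentences(sizes_bits, channel_gains):
    """Pair sentences with subcarriers by rank (assumed rule, not the paper's).

    The largest sentence is mapped to the subcarrier with the highest gain,
    the second largest to the second highest, and so on.
    Returns a dict {sentence_index: subcarrier_index}.
    """
    if len(sizes_bits) != len(channel_gains):
        raise ValueError("expected exactly one subcarrier per sentence")
    sentence_order = sorted(
        range(len(sizes_bits)), key=lambda i: sizes_bits[i], reverse=True
    )
    carrier_order = sorted(
        range(len(channel_gains)), key=lambda k: channel_gains[k], reverse=True
    )
    return dict(zip(sentence_order, carrier_order))


def total_delay(sizes_bits, channel_gains, assignment,
                power=1.0, noise=1.0, bandwidth=1.0):
    """Sum of per-sentence delays under an assumed Shannon-rate model."""
    delay = 0.0
    for sentence, carrier in assignment.items():
        snr = power * channel_gains[carrier] / noise
        rate = bandwidth * math.log2(1.0 + snr)  # bits per unit time
        delay += sizes_bits[sentence] / rate
    return delay


sizes = [100, 400, 250]    # sentence sizes in bits (illustrative values)
gains = [0.5, 2.0, 1.0]    # subcarrier channel gains (illustrative values)
ranked = associate_sentences(sizes, gains)
swapped = {0: 1, 1: 0, 2: 2}  # an arbitrary alternative for comparison
print(total_delay(sizes, gains, ranked), total_delay(sizes, gains, swapped))
```

Under these simplifying assumptions the ranked pairing minimizes the summed delay, by the rearrangement inequality. Whether that matches the optimality result claimed for the paper's own system model cannot be determined from the abstract alone.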
Abstract:The time-consuming nature of training and deploying complicated Machine and Deep Learning (DL) models for a variety of applications continues to pose significant challenges in the field of Machine Learning (ML). These challenges are particularly pronounced in the federated domain, where optimizing models for individual nodes poses significant difficulty. Many methods have been developed to tackle this problem, aiming to reduce training expenses and time while maintaining efficient optimisation. Three strategies suggested for this challenge are Active Learning, Knowledge Distillation, and Local Memorization. These methods enable the adoption of smaller models that require fewer computational resources and allow for model personalization with local insights, thereby improving the effectiveness of current models. The present study delves into the fundamental principles of these three approaches and proposes an advanced Federated Learning System that utilises different Personalisation methods towards improving the accuracy of AI models and enhancing user experience in real-time NG-IoT applications, investigating the efficacy of these techniques in the local and federated domain. The results of the original and optimised models are then compared in both local and federated contexts using a comparative analysis. The post-analysis shows encouraging outcomes when it comes to optimising and personalising the models with the suggested techniques.
Abstract:Over recent years, the protection of so-called 'soft-targets', i.e. locations easily accessible by the general public yet with relatively weak security measures, has emerged as a challenging and increasingly important issue. The complexity and seriousness of this security threat is growing exponentially, due to the emergence of new advanced technologies (e.g. Artificial Intelligence (AI), Autonomous Vehicles (AVs), 3D printing, etc.), especially when it comes to large-scale, popular and diverse public spaces. In this paper, a novel Digital Twin-as-a-Security-Service (DTaaSS) architecture is introduced for holistically and significantly enhancing the protection of public spaces (e.g. metro stations, leisure sites, urban squares, etc.). The proposed framework combines a Digital Twin (DT) conceptualization with additional cutting-edge technologies, including Internet of Things (IoT), cloud computing, Big Data analytics and AI. In particular, DTaaSS comprises a holistic, real-time, large-scale, comprehensive and data-driven security solution for the efficient/robust protection of public spaces, supporting: a) data collection and analytics, b) area monitoring/control and proactive threat detection, c) incident/attack prediction, and d) quantitative and data-driven vulnerability assessment. Overall, the designed architecture exhibits increased potential in handling complex, hybrid and combined threats over large, critical and popular soft-targets. The applicability and robustness of DTaaSS are discussed in detail against representative and diverse real-world application scenarios, including complex attacks on: a) a metro station, b) a leisure site, and c) a cathedral square.
Abstract:Current methods for low- and few-shot object detection have primarily focused on enhancing model performance for detecting objects. One common approach to achieve this is by combining model finetuning with data augmentation strategies. However, little attention has been given to the energy efficiency of these approaches in data-scarce regimes. This paper presents a comprehensive empirical study that examines both model performance and energy efficiency of custom data augmentations and automated data augmentation selection strategies when combined with a lightweight object detector. The methods are evaluated on three different benchmark datasets in terms of their performance and energy consumption, and the Efficiency Factor is employed to gain insights into their effectiveness considering both performance and efficiency. It is shown that, in many cases, the performance gains of data augmentation strategies are overshadowed by their increased energy usage, necessitating the development of more energy-efficient data augmentation strategies to address data scarcity.
Abstract:Image data augmentation constitutes a critical methodology in modern computer vision tasks, since it can enhance the diversity and quality of training datasets, thereby improving the performance and robustness of machine learning models in downstream tasks. In parallel, augmentation approaches can also be used for editing/modifying a given image in a context- and semantics-aware way. Diffusion Models (DMs), which comprise one of the most recent and highly promising classes of methods in the field of generative Artificial Intelligence (AI), have emerged as a powerful tool for image data augmentation, capable of generating realistic and diverse images by learning the underlying data distribution. The current study presents a systematic, comprehensive and in-depth review of DM-based approaches for image augmentation, covering a wide range of strategies, tasks and applications. In particular, a comprehensive analysis of the fundamental principles, model architectures and training strategies of DMs is initially performed. Subsequently, a taxonomy of the relevant image augmentation methods is introduced, focusing on techniques regarding semantic manipulation, personalization and adaptation, and application-specific augmentation tasks. Then, performance assessment methodologies and respective evaluation metrics are analyzed. Finally, current challenges and future research directions in the field are discussed.
Abstract:In response to the increasing number of devices anticipated in next-generation networks, a shift toward over-the-air (OTA) computing has been proposed. Leveraging the superposition of multiple access channels, OTA computing enables efficient resource management by supporting simultaneous uncoded transmission in the time and the frequency domain. Thus, to advance the integration of OTA computing, our study presents a theoretical analysis addressing practical issues encountered in current digital communication transceivers, such as time sampling error and intersymbol interference (ISI). To this end, we examine the theoretical mean squared error (MSE) for OTA transmission under time sampling error and ISI, while also exploring methods for minimizing the MSE in the OTA transmission. Utilizing alternating optimization, we also derive optimal power policies for both the devices and the base station. Additionally, we propose a novel deep neural network (DNN)-based approach to design waveforms enhancing OTA transmission performance under time sampling error and ISI. To ensure fair comparison with existing waveforms like the raised cosine (RC) and the better-than-raised-cosine (BTRC), we incorporate a custom loss function integrating energy and bandwidth constraints, along with practical design considerations such as waveform symmetry. Simulation results validate our theoretical analysis and demonstrate performance gains of the designed pulse over RC and BTRC waveforms. To facilitate testing of our results without necessitating the DNN structure recreation, we provide curve fitting parameters for select DNN-based waveforms as well.
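To make the link between time sampling error and ISI concrete, the following sketch evaluates the standard raised cosine (RC) pulse at the ideal sampling instants and at slightly offset ones. It is only an illustration of the baseline waveform named above: the unit-peak normalization, the default roll-off of 0.5, the truncation to ten neighbouring symbols, and the helper names `raised_cosine` and `isi_power` are assumptions, and the OTA MSE analysis and the DNN-designed pulse are not reproduced.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def raised_cosine(t, symbol_period=1.0, rolloff=0.5):
    """Raised cosine pulse with unit peak at t = 0 (assumed normalization)."""
    x = t / symbol_period
    denom = 1.0 - (2.0 * rolloff * x) ** 2
    if abs(denom) < 1e-12:
        # Removable singularity at |t| = T / (2 * rolloff).
        return (math.pi / 4.0) * sinc(1.0 / (2.0 * rolloff))
    return sinc(x) * math.cos(math.pi * rolloff * x) / denom


def isi_power(timing_error, num_neighbours=10, symbol_period=1.0, rolloff=0.5):
    """Sum of squared pulse samples leaking in from neighbouring symbols."""
    return sum(
        raised_cosine(k * symbol_period + timing_error, symbol_period, rolloff) ** 2
        for k in range(-num_neighbours, num_neighbours + 1)
        if k != 0
    )


for eps in (0.0, 0.05, 0.1):
    print(eps, isi_power(eps))
```

With perfect timing the RC pulse is zero at every non-zero multiple of the symbol period, so the leakage term vanishes up to floating-point error. Any sampling offset makes the neighbouring symbols contribute, which is the effect the analysis above accounts for.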
Abstract:The increased availability of medical data has significantly impacted healthcare by enabling the application of machine/deep learning approaches in various instances. However, medical datasets are usually small and scattered across multiple providers, suffer from high class-imbalance, and are subject to stringent data privacy constraints. In this paper, the application of a data regularization algorithm, suitable for learning under high class-imbalance, in a federated learning setting is proposed. Specifically, the goal of the proposed method is to enhance model performance for cardiovascular disease prediction by tackling the class-imbalance that typically characterizes datasets used for this purpose, as well as by leveraging patient data available in different nodes of a federated ecosystem without compromising their privacy, while enabling more resource-sensitive allocation. The method is evaluated across four datasets for cardiovascular disease prediction, which are scattered across different clients, achieving improved performance. Meanwhile, its robustness under various hyperparameter settings, as well as its ability to adapt to different resource allocation scenarios, is verified.
Abstract:Federated learning (FL) is a decentralized learning technique that enables participating devices to collaboratively build a shared Machine Learning (ML) or Deep Learning (DL) model without revealing their raw data to a third party. Due to its privacy-preserving nature, FL has sparked widespread attention for building Intrusion Detection Systems (IDS) within the realm of cybersecurity. However, the data heterogeneity across participating domains and entities presents significant challenges for the reliable implementation of an FL-based IDS. In this paper, we propose an effective method called Statistical Averaging (StatAvg) to alleviate non-independently and identically distributed (non-iid) features across local clients' data in FL. In particular, StatAvg allows the FL clients to share their individual data statistics with the server, which then aggregates this information to produce global statistics. The latter are shared with the clients and used for universal data normalisation. It is worth mentioning that StatAvg can seamlessly integrate with any FL aggregation strategy, as it occurs before the actual FL training process. The proposed method is evaluated against baseline approaches using datasets for network and host Artificial Intelligence (AI)-powered IDS. The experimental results demonstrate the efficiency of StatAvg in mitigating non-iid feature distributions across the FL clients compared to the baseline methods.
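The statistics-sharing step described above can be sketched as follows. The abstract does not say which statistics are exchanged or how they are aggregated, so this is an assumption-laden illustration: clients are assumed to send per-feature sample counts, means and population variances, and the server is assumed to pool them with the law of total variance. The function names `local_stats`, `aggregate_stats` and `normalise` are hypothetical and not taken from the paper.

```python
import math


def local_stats(rows):
    """Client side: per-feature count, mean and population variance."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    variances = [
        sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(dims)
    ]
    return n, means, variances


def aggregate_stats(client_stats):
    """Server side: pool client statistics into global mean and variance.

    Uses the law of total variance, so the result equals the statistics of
    the concatenated data without the server ever seeing raw samples.
    """
    total = sum(n for n, _, _ in client_stats)
    dims = len(client_stats[0][1])
    g_mean = [
        sum(n * m[j] for n, m, _ in client_stats) / total for j in range(dims)
    ]
    g_var = [
        sum(n * (v[j] + (m[j] - g_mean[j]) ** 2) for n, m, v in client_stats)
        / total
        for j in range(dims)
    ]
    return g_mean, g_var


def normalise(rows, g_mean, g_var, eps=1e-12):
    """Client side: standardise local data with the shared global statistics."""
    return [
        [(x - mu) / math.sqrt(var + eps) for x, mu, var in zip(r, g_mean, g_var)]
        for r in rows
    ]


client_a = [[1.0], [3.0]]   # illustrative single-feature data
client_b = [[5.0], [7.0]]
g_mean, g_var = aggregate_stats([local_stats(client_a), local_stats(client_b)])
print(g_mean, g_var)        # statistics of the pooled data [1, 3, 5, 7]
print(normalise(client_a, g_mean, g_var))
```

Because this step runs once before training, it is independent of how model updates are later aggregated, which is consistent with the claim that the method can be combined with any FL aggregation strategy.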
Abstract:In the evolving landscape of sixth-generation (6G) wireless networks, which demand ultra-high data rates, this study introduces the concept of super constellation communications. We also present super amplitude phase shift keying (SAPSK), an innovative modulation technique designed to meet these ultra-high data rate demands. SAPSK is complemented by the generalized polar distance detector (GPD-D), which approximates the optimal maximum likelihood detector in channels with Gaussian phase noise (GPN). By leveraging the decision regions formulated by GPD-D, a tight closed-form approximation for the symbol error probability (SEP) of SAPSK constellations is derived, while a detection algorithm with O(1) time complexity is developed to ensure fast and efficient SAPSK symbol detection. Finally, the theoretical performance of SAPSK and the efficiency of the proposed O(1) algorithm are validated by numerical simulations, highlighting both its superiority in terms of SEP compared to various constellations and its practical advantages in terms of fast and accurate symbol detection.
Abstract:The escalating volumes of textile waste globally necessitate innovative waste management solutions to mitigate the environmental impact and promote sustainability in the fashion industry. This paper addresses the inefficiencies of traditional textile sorting methods by introducing an autonomous textile analysis pipeline. Utilising robotics, spectral imaging, and AI-driven classification, our system enhances the accuracy, efficiency, and scalability of textile sorting processes, contributing to a more sustainable and circular approach to waste management. The integration of a Digital Twin system further allows critical evaluation of technical and economic feasibility, providing valuable insights into the sorting system's accuracy and reliability. The proposed framework, inspired by Industry 4.0 principles, comprises five interconnected layers facilitating seamless data exchange and coordination within the system. Preliminary results highlight the potential of our holistic approach to mitigate environmental impact and foster a positive shift towards recycling in the textile industry.