Abstract: Communication with the goal of accurately conveying meaning, rather than accurately transmitting symbols, has become an area of growing interest. This paradigm, termed semantic communication, typically leverages modern developments in artificial intelligence and machine learning to improve the efficiency and robustness of communication systems. However, a standard model for capturing and quantifying the details of "meaning" is lacking, and many leading approaches to semantic communication adopt a black-box framework with little understanding of what exactly the model is learning. One solution is to utilize the conceptual spaces framework, which models meaning explicitly in a geometric manner. Though prior work studying semantic communication with conceptual spaces has shown promising results, these previous attempts involve hand-crafting the conceptual space model, severely limiting the scalability and practicality of the approach. In this work, we develop a framework for learning a domain of a conceptual space model using only raw data annotated with high-level property labels. In experiments using the MNIST and CelebA datasets, we show that the domains learned with the framework preserve semantic similarity relations and possess interpretable dimensions.
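To make the idea concrete, below is a minimal sketch of how a conceptual-space domain might be learned from raw data with property labels, assuming a simple encoder-plus-property-head setup in PyTorch. The architecture, dimensions, and loss here are illustrative assumptions, not the paper's exact model.

```python
# Hypothetical sketch: learn a low-dimensional conceptual-space domain
# from raw inputs, supervised only by high-level property labels.
import torch
import torch.nn as nn

class DomainEncoder(nn.Module):
    """Maps raw inputs (e.g., flattened MNIST images) to coordinates
    in a learned conceptual-space domain."""
    def __init__(self, in_dim=784, domain_dim=2, n_properties=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, domain_dim),      # coordinates in the domain
        )
        # Property labels supervise the geometry of the learned domain.
        self.property_head = nn.Linear(domain_dim, n_properties)

    def forward(self, x):
        z = self.encoder(x)
        return z, self.property_head(z)

def training_step(model, x, property_labels, opt):
    z, logits = model(x)
    # Classification loss ties domain coordinates to high-level
    # properties, so points sharing a property cluster in the domain,
    # preserving semantic similarity relations.
    loss = nn.functional.cross_entropy(logits, property_labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```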
Abstract: Fault detection and diagnosis of electrical motors are of utmost importance in ensuring the safe and reliable operation of several industrial systems. Detection and diagnosis of faults at the incipient stage allows corrective actions to be taken to reduce the severity of faults. Existing data-driven deep learning approaches to machine fault diagnosis rely extensively on huge amounts of labeled samples, for which annotation is expensive and time-consuming. Meanwhile, a major portion of unlabeled condition monitoring data is not exploited in the training process. To overcome this limitation, we propose a foundational model-based Active Learning framework that utilizes a small number of the most informative labeled samples and harnesses a large amount of available unlabeled data by effectively combining Active Learning and Contrastive Self-Supervised Learning techniques. It consists of a transformer network-based backbone model trained using an advanced nearest-neighbor contrastive self-supervised learning method. This approach empowers the backbone to learn improved representations of samples derived from raw, unlabeled vibration data. Subsequently, the backbone can undergo fine-tuning to address a range of downstream tasks, both within the same machines and across different machines. The effectiveness of the proposed methodology has been assessed through the fine-tuning of the backbone for multiple target tasks using three distinct machine-bearing fault datasets. The experimental evaluation demonstrates superior performance compared to existing state-of-the-art fault diagnosis methods while using less labeled data.
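For intuition, here is a sketch of a nearest-neighbor contrastive loss in the spirit of NNCLR, which the abstract's pretraining stage builds on; the tensor shapes, queue mechanism, and temperature are illustrative assumptions rather than the authors' exact implementation.

```python
# Illustrative nearest-neighbor contrastive loss for self-supervised
# pretraining on unlabeled vibration segments.
import torch
import torch.nn.functional as F

def nn_contrastive_loss(z1, z2, support_queue, temperature=0.1):
    """z1, z2: (B, D) embeddings of two augmented views of the same
    segments; support_queue: (Q, D) embeddings of past samples."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    queue = F.normalize(support_queue, dim=1)
    # Swap each anchor for its nearest neighbor in the support queue,
    # encouraging invariance beyond hand-crafted augmentations.
    nn_idx = (z1 @ queue.T).argmax(dim=1)
    anchors = queue[nn_idx]                        # (B, D)
    logits = anchors @ z2.T / temperature          # (B, B)
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```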
Abstract: A majority of recent advancements in the fault diagnosis of electrical motors are based on the assumption that training and testing data are drawn from the same distribution. However, the data distribution can vary across different operating conditions in real-world operating scenarios of electrical motors. Consequently, this assumption limits the practical implementation of existing studies for fault diagnosis, as they rely on fully labelled training data spanning all operating conditions and assume a consistent distribution. This is because obtaining a large number of labelled samples for several machines across different fault cases and operating scenarios may be infeasible. To overcome these limitations, this work proposes a framework for developing a foundational model for fault diagnosis of electrical motors. It involves building a neural network-based backbone to learn high-level features using self-supervised learning, and then fine-tuning the backbone to achieve specific objectives. The primary advantage of such an approach is that the backbone can be fine-tuned to achieve a wide variety of target tasks using far less training data than traditional supervised learning methodologies. The empirical evaluation demonstrates the effectiveness of the proposed approach by obtaining more than 90% classification accuracy when fine-tuning the backbone not only across different types of fault scenarios and operating conditions, but also across different machines. This illustrates the promising potential of the proposed approach for cross-machine fault diagnosis tasks in real-world applications.
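The fine-tuning stage described above might look like the following sketch: a small linear head attached to a frozen, self-supervised backbone and trained on a handful of labeled samples. The function names, freezing choice, and hyperparameters are illustrative assumptions.

```python
# Minimal fine-tuning sketch: pretrained backbone + small linear head.
import torch
import torch.nn as nn

def fine_tune(backbone: nn.Module, feat_dim: int, n_classes: int,
              labeled_loader, epochs: int = 10, lr: float = 1e-3):
    """Attach a linear head to a self-supervised backbone and train it
    on a small labeled set, keeping the backbone frozen."""
    head = nn.Linear(feat_dim, n_classes)
    backbone.requires_grad_(False)        # reuse pretrained features
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in labeled_loader:
            with torch.no_grad():
                feats = backbone(x)       # high-level features
            loss = nn.functional.cross_entropy(head(feats), y)
            opt.zero_grad(); loss.backward(); opt.step()
    return head
```

Because only the head is trained, the same backbone can be reused across fault types, operating conditions, and even different machines, which is the cross-machine advantage the abstract highlights.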
Abstract: As our world grows increasingly connected and new technologies arise, global demands for data traffic continue to rise exponentially. Limited by the fundamental results of information theory, we are forced to increase either power or bandwidth usage to meet these demands. But what if there were a way to use these resources more efficiently? This question is the main driver behind the recent surge of interest in semantic communication, which seeks to leverage increased intelligence to move beyond the Shannon limit of technical communication. In this paper, we present a method of achieving semantic communication that utilizes the conceptual space model of knowledge representation. In contrast to other popular methods of semantic communication, our approach is intuitive, interpretable, and efficient. We derive preliminary results bounding the probability of semantic error under our framework, and show how our approach can serve as the underlying knowledge-driven foundation for higher-level intelligent systems. Taking inspiration from a metaverse application, we perform simulations to draw important insights about the proposed method and demonstrate how it can be used to achieve semantic communication with a 99.9% reduction in rate compared to a more traditional setup.
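A toy sketch of the semantic-error notion referenced above: the receiver maps a noisy point in the conceptual space to the nearest concept prototype, and a semantic error occurs when the decoded concept differs from the intended one. The prototypes, Gaussian noise model, and Monte Carlo estimate are illustrative assumptions, not the paper's derived bound.

```python
# Toy conceptual-space semantic decoding and empirical error rate.
import numpy as np

def semantic_decode(point, prototypes):
    """prototypes: (K, D) array of concept centroids in the space."""
    dists = np.linalg.norm(prototypes - point, axis=1)
    return int(dists.argmin())

def semantic_error_rate(prototypes, noise_std, trials=10000, rng=None):
    """Monte Carlo estimate of the probability of semantic error
    under additive Gaussian perturbation of the transmitted point."""
    rng = rng or np.random.default_rng(0)
    K, D = prototypes.shape
    errors = 0
    for _ in range(trials):
        k = rng.integers(K)                       # intended concept
        received = prototypes[k] + rng.normal(0, noise_std, size=D)
        errors += semantic_decode(received, prototypes) != k
    return errors / trials
```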
Abstract: Uncertainty quantification is a critical yet unsolved challenge for deep learning, especially for time-series imputation with irregularly sampled measurements. To tackle this problem, we propose a novel framework based on the principles of recurrent neural networks and neural stochastic differential equations for reconciling irregularly sampled measurements. We impute measurements at any arbitrary timescale and quantify the uncertainty of the imputations in a principled manner. Specifically, we derive analytical expressions for quantifying and propagating epistemic and aleatoric uncertainty across time instants. Our experiments on the IEEE 37-bus test distribution system reveal that our framework outperforms state-of-the-art uncertainty quantification approaches for time-series data imputation.
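While the paper derives analytical expressions for propagating uncertainty, the core mechanism can be illustrated with a simple Monte Carlo Euler-Maruyama rollout of a neural SDE between observations; the drift/diffusion networks and sampling-based variance estimate below are illustrative assumptions, not the paper's closed-form method.

```python
# Sketch: neural SDE imputation via Euler-Maruyama rollouts.
import torch
import torch.nn as nn

class NeuralSDE(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.drift = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, dim))
        self.diffusion = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                       nn.Linear(hidden, dim), nn.Softplus())

    def step(self, x, dt):
        # One Euler-Maruyama step: dx = f(x) dt + g(x) dW.
        dw = torch.randn_like(x) * (dt ** 0.5)
        return x + self.drift(x) * dt + self.diffusion(x) * dw

def impute(model, x0, n_steps, dt, n_samples=50):
    """Roll the SDE forward from the last observation x0; the sample
    mean is the imputation and the sample variance reflects the
    propagated uncertainty of the stochastic dynamics."""
    paths = []
    for _ in range(n_samples):
        x = x0.clone()
        for _ in range(n_steps):
            x = model.step(x, dt)
        paths.append(x)
    paths = torch.stack(paths)
    return paths.mean(0), paths.var(0)
```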
Abstract: Despite the fact that Shannon and Weaver's Mathematical Theory of Communication was published over 70 years ago, all communication systems continue to operate at the first of the three levels defined in this theory: the technical level. In this letter, we argue that a transition to the semantic level embodies a natural, important step in the evolution of communication technologies. Furthermore, we propose a novel approach to engineering semantic communication using conceptual spaces and functional compression. We introduce a model of semantic communication utilizing this approach, and present quantitative simulation results demonstrating performance gains on the order of 3 dB.
Abstract: As the global demand for data has continued to rise exponentially, some have begun turning to the idea of semantic communication as a means of efficiently meeting this demand. Pushing beyond the boundaries of conventional communication systems, semantic communication focuses on the accurate recovery of the meaning conveyed from source to receiver, as opposed to the accurate recovery of transmitted symbols. In this work, we aim to provide a comprehensive view of the history and current state of semantic communication and the techniques for engineering this higher level of communication. A survey of the current literature reveals four broad approaches to engineering semantic communication. We term the earliest of these approaches classical semantic information, which seeks to extend information-theoretic results to include semantic information. A second approach makes use of knowledge graphs to achieve semantic communication, and a third utilizes the power of modern deep learning techniques to facilitate this communication. The fourth approach focuses on the significance of information, rather than its meaning, to achieve efficient, goal-oriented communication. We discuss each of these four approaches and their corresponding works in detail, and provide some challenges and opportunities that pertain to each approach. Finally, we introduce a novel approach to semantic communication, which we term context-based semantic communication. Inspired by the way in which humans naturally communicate with one another, this context-based approach provides a general, optimization-based design framework for semantic communication systems. Altogether, this survey provides a useful guide for the design and implementation of semantic communication systems.
Abstract: Despite being the subject of a growing body of research, non-orthogonal multiple access has failed to garner sufficient support to be included in modern standards. One of the more promising approaches to non-orthogonal multiple access is sparse code multiple access, which seeks to utilize non-orthogonal, sparse spreading codes to share bandwidth among users more efficiently than traditional orthogonal methods. Nearly all of the studies regarding sparse code multiple access assume synchronization at the receiver, which may not always be a practical assumption. In this work, we aim to bring this promising technology closer to a practical realization by dropping the assumption of synchronization. We therefore propose a compressed sensing-based delay estimation technique developed specifically for an uplink sparse code multiple access system. The proposed technique can be used with nearly all of the numerous decoding algorithms proposed in the existing literature, including the popular message passing approach. Furthermore, we derive a theoretical bound on the recovery performance of the proposed technique, and use simulations to demonstrate its viability in a practical uplink system.
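One common way to realize compressed sensing-based delay estimation, sketched below, is to build a dictionary of each user's spreading code at every candidate delay and recover the active (user, delay) pairs with orthogonal matching pursuit. The circular-shift dictionary and dimensions are illustrative assumptions, not necessarily the paper's exact construction.

```python
# Illustrative CS delay estimation for an uplink SCMA-like system.
import numpy as np

def build_dictionary(codes, max_delay):
    """codes: (U, N) user spreading codes -> (N, U * max_delay)
    dictionary whose atoms are the codes at each candidate delay."""
    U, N = codes.shape
    atoms = [np.roll(codes[u], d)
             for u in range(U) for d in range(max_delay)]
    return np.stack(atoms, axis=1)

def omp_delays(y, A, n_users):
    """Greedy orthogonal matching pursuit: select one (user, delay)
    atom per iteration and re-fit on the growing support."""
    residual, support = y.copy(), []
    for _ in range(n_users):
        idx = int(np.abs(A.conj().T @ residual).argmax())
        support.append(idx)
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    return sorted(support)   # each index encodes a user and its delay
```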
Abstract: Deep reinforcement learning (DRL) has empowered a variety of artificial intelligence fields, including pattern recognition, robotics, recommendation systems, and gaming. Similarly, graph neural networks (GNNs) have demonstrated superior performance in supervised learning for graph-structured data. In recent times, the fusion of GNNs with DRL for graph-structured environments has attracted a lot of attention. This paper provides a comprehensive review of these hybrid works. These works can be classified into two categories: (1) algorithmic enhancement, where DRL and GNN complement each other methodologically for better utility; and (2) application-specific enhancement, where DRL and GNN support each other in solving domain-specific problems. This fusion effectively addresses various complex problems in engineering and the life sciences. Based on the review, we further analyze the applicability and benefits of fusing these two domains, especially in terms of increasing generalizability and reducing computational complexity. Finally, we highlight the key challenges in integrating DRL and GNN, along with potential future research directions, which will be of interest to the broader machine learning community.
Abstract: Influence maximization (IM) is the combinatorial problem of identifying a subset of nodes in a network (graph), called the seed nodes, which, when activated, provide a maximal spread of influence in the network for a given diffusion model and a budget on seed set size. IM has numerous applications such as viral marketing, epidemic control, sensor placement, and other network-related tasks. However, its practical use is limited by the computational complexity of current algorithms. Recently, learned heuristics for IM have been explored to ease the computational burden. However, current approaches have serious limitations: (1) IM formulations only consider influence via spread and ignore self-activation; (2) poor scalability to large graphs; (3) limited generalizability across graph families; and (4) low computational efficiency, with a long running time to identify seed sets for every test network. In this work, we address each of these limitations through a unique approach that involves (1) formulating a generic IM problem as a Markov decision process that handles both intrinsic and influence activations; (2) employing double Q-learning to estimate seed nodes; (3) ensuring scalability via sub-graph-based representations; and (4) incorporating generalizability via meta-learning across graph families. Extensive experiments are carried out on various standard networks to validate the performance of the proposed Graph Meta Reinforcement learning (GraMeR) framework. The results indicate that GraMeR is multiple orders of magnitude faster and more general than conventional approaches.
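To illustrate the double Q-learning component named above, here is a sketch of the standard tabular update applied to sequential seed selection; the state/action encoding and reward are illustrative assumptions, and GraMeR's sub-graph representations and meta-learning loop are not reproduced here.

```python
# Toy double Q-learning update for sequential seed selection.
import random
from collections import defaultdict

def double_q_update(Q1, Q2, s, a, r, s_next, actions,
                    alpha=0.1, gamma=0.99):
    """Double Q-learning: one table selects the greedy action, the
    other evaluates it, reducing the over-estimation bias of
    vanilla Q-learning."""
    if random.random() < 0.5:
        a_star = max(actions, key=lambda b: Q1[(s_next, b)])
        Q1[(s, a)] += alpha * (r + gamma * Q2[(s_next, a_star)]
                               - Q1[(s, a)])
    else:
        a_star = max(actions, key=lambda b: Q2[(s_next, b)])
        Q2[(s, a)] += alpha * (r + gamma * Q1[(s_next, a_star)]
                               - Q2[(s, a)])

# Usage sketch: s is the current seed set (e.g., a frozenset of nodes),
# a is the node added next, and r is its estimated marginal spread.
Q1, Q2 = defaultdict(float), defaultdict(float)
```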