Abstract:Environment Sound Classification has been a well-studied research problem in the field of signal processing and up till now more focus has been laid on fully supervised approaches. Over the last few years, focus has moved towards semi-supervised methods which concentrate on the utilization of unlabeled data, and self-supervised methods which learn the intermediate representation through pretext task or contrastive learning. However, both approaches require a vast amount of unlabelled data to improve performance. In this work, we propose a novel framework called Environmental Sound Classification with Hierarchical Ontology-guided semi-supervised Learning (ECHO) that utilizes label ontology-based hierarchy to learn semantic representation by defining a novel pretext task. In the pretext task, the model tries to predict coarse labels defined by the Large Language Model (LLM) based on ground truth label ontology. The trained model is further fine-tuned in a supervised way to predict the actual task. Our proposed novel semi-supervised framework achieves an accuracy improvement in the range of 1\% to 8\% over baseline systems across three datasets namely UrbanSound8K, ESC-10, and ESC-50.
Abstract:The careful planning and safe deployment of 5G technologies will bring enormous benefits to society and the economy. Higher frequency, beamforming, and small-cells are key technologies that will provide unmatched throughput and seamless connectivity to 5G users. Superficial knowledge of these technologies has raised concerns among the general public about the harmful effects of radiation. Several standardization bodies are active to put limits on the emissions which are based on a defined set of radiation measurement methodologies. However, due to the peculiarity of 5G such as dynamicity of the beams, network densification, Time Division Duplexing mode of operation, etc, using existing EMF measurement methods may provide inaccurate results. In this context, we discuss our experimental studies aimed towards the measurement of radiation caused by beam-based transmissions from a 5G base station equipped with an Active Antenna System(AAS). We elaborate on the shortcomings of current measurement methodologies and address several open questions. Next, we demonstrate that using user-specific downlink beamforming, not only better performance is achieved compared to non-beamformed downlink, but also the radiation in the vicinity of the intended user is significantly decreased. Further, we show that under weak reception conditions, an uplink transmission can cause significantly high radiation in the vicinity of the user equipment. We believe that our work will help in clearing several misleading concepts about the 5G EMF radiation effects. We conclude the work by providing guidelines to improve the methodology of EMF measurement by considering the spatiotemporal dynamicity of the 5G transmission.
Abstract:The Artificial Intelligence Satellite Telecommunications Testbed (AISTT), part of the ESA project SPAICE, is focused on the transformation of the satellite payload by using artificial intelligence (AI) and machine learning (ML) methodologies over available commercial off-the-shelf (COTS) AI chips for on-board processing. The objectives include validating artificial intelligence-driven SATCOM scenarios such as interference detection, spectrum sharing, radio resource management, decoding, and beamforming. The study highlights hardware selection and payload architecture. Preliminary results show that ML models significantly improve signal quality, spectral efficiency, and throughput compared to conventional payload. Moreover, the testbed aims to evaluate the performance and application of AI-capable COTS chips in onboard SATCOM contexts.
Abstract:Smartphones have become indispensable in our daily lives and can do almost everything, from communication to online shopping. However, with the increased usage, cybercrime aimed at mobile devices is rocketing. Smishing attacks, in particular, have observed a significant upsurge in recent years. This problem is further exacerbated by the perpetrator creating new deceptive websites daily, with an average life cycle of under 15 hours. This renders the standard practice of keeping a database of malicious URLs ineffective. To this end, we propose a novel on-device pipeline: COPS that intelligently identifies features of fraudulent messages and URLs to alert the user in real-time. COPS is a lightweight pipeline with a detection module based on the Disentangled Variational Autoencoder of size 3.46MB for smishing and URL phishing detection, and we benchmark it on open datasets. We achieve an accuracy of 98.15% and 99.5%, respectively, for both tasks, with a false negative and false positive rate of a mere 0.037 and 0.015, outperforming previous works with the added advantage of ensuring real-time alerts on resource-constrained devices.
Abstract:This paper delves into the application of Machine Learning (ML) techniques in the realm of 5G Non-Terrestrial Networks (5G-NTN), particularly focusing on symbol detection and equalization for the Physical Broadcast Channel (PBCH). As 5G-NTN gains prominence within the 3GPP ecosystem, ML offers significant potential to enhance wireless communication performance. To investigate these possibilities, we present ML-based models trained with both synthetic and real data from a real 5G over-the-satellite testbed. Our analysis includes examining the performance of these models under various Signal-to-Noise Ratio (SNR) scenarios and evaluating their effectiveness in symbol enhancement and channel equalization tasks. The results highlight the ML performance in controlled settings and their adaptability to real-world challenges, shedding light on the potential benefits of the application of ML in 5G-NTN.
Abstract:Emotion prediction is the field of study to understand human emotions. Existing methods focus on modalities like text, audio, facial expressions, etc., which could be private to the user. Emotion can be derived from the subject's psychological data as well. Various approaches that employ combinations of physiological sensors for emotion recognition have been proposed. Yet, not all sensors are simple to use and handy for individuals in their daily lives. Thus, we propose a system to predict user emotion using smartwatch sensors. We design a framework to collect ground truth in real-time utilizing a mix of English and regional language-based videos to invoke emotions in participants and collect the data. Further, we modeled the problem as binary classification due to the limited dataset size and experimented with multiple machine-learning models. We also did an ablation study to understand the impact of features including Heart Rate, Accelerometer, and Gyroscope sensor data on mood. From the experimental results, Multi-Layer Perceptron has shown a maximum accuracy of 93.75 percent for pleasant-unpleasant (high/low valence classification) moods.
Abstract:Interest in the integration of Terrestrial Networks (TN) and Non-Terrestrial Networks (NTN); primarily satellites; has been rekindled due to the potential of NTN to provide ubiquitous coverage. Especially with the peculiar and flexible physical layer properties of 5G-NR, now direct access to 5G services through satellites could become possible. However, the large Round-Trip Delays (RTD) in NTNs require a re-evaluation of the design of RLC and PDCP layers timers ( and associated buffers), in particular for the regenerative payload satellites which have limited computational resources, and hence need to be optimally utilized. Our aim in this work is to initiate a new line of research for emerging NTNs with limited resources from a higher-layer perspective. To this end, we propose a novel and efficient method for optimally designing the RLC and PDCP layers' buffers and timers without the need for intensive computations. This approach is relevant for low-cost satellites, which have limited computational and energy resources. The simulation results show that the proposed methods can significantly improve the performance in terms of resource utilization and delays.
Abstract:Intelligent reflecting surfaces (IRS) have emerged as a promising technology to enhance the performance of wireless communication systems. By actively manipulating the wireless propagation environment, IRS enables efficient signal transmission and reception. In recent years, the integration of IRS with full-duplex (FD) communication has garnered significant attention due to its potential to further improve spectral and energy efficiencies. IRS-assisted FD systems combine the benefits of both IRS and FD technologies, providing a powerful solution for the next generation of cellular systems. In this manuscript, we present a novel approach to jointly optimize active and passive beamforming in a multiple-input-multiple-output (MIMO) FD system assisted by an IRS for weighted sum rate (WSR) maximization. Given the inherent difficulty in obtaining perfect channel state information (CSI) in practical scenarios, we consider imperfect CSI and propose a statistically robust beamforming strategy to maximize the ergodic WSR. Additionally, we analyze the achievable WSR for an IRS-assisted MIMO FD system under imperfect CSI by deriving both the lower and upper bounds. To tackle the problem of ergodic WSR maximization, we employ the concept of expected weighted minimum mean squared error (EWMMSE), which exploits the information of the expected error covariance matrices and ensures convergence to a local optimum. We evaluate the effectiveness of our proposed design through extensive simulations. The results demonstrate that our robust approach yields significant performance improvements compared to the simplistic beamforming approach that disregards CSI errors, while also outperforming the robust half-duplex (HD) system considerably
Abstract:In this paper, a novel robust beamforming for an intelligent reflecting surface (IRS) assisted FD system is presented. Since perfect channel state information (CSI) is often challenging to acquire in practice, we consider the case of imperfect CSI and adopt a statistically robust beamforming approach to maximize the ergodic weighted sum rate (WSR). We also analyze the achievable WSR of an IRS-assisted FD with imperfect CSI, for which the lower and the upper bounds are derived. The ergodic WSR maximization problem is tackled based on the expected Weighted Minimum Mean Squared Error (WMMSE), which is guaranteed to converge to a local optimum. The effectiveness of the proposed design is investigated with extensive simulation results. It is shown that our robust design achieves significant performance gain compared to the naive beamforming approaches and considerably outperforms the robust Half-Duplex (HD) system.
Abstract:In this paper, we propose a sentiment-enriched lightweight network SeLiNet and an end-to-end on-device pipeline for contextual emotion recognition in images. SeLiNet model consists of body feature extractor, image aesthetics feature extractor, and learning-based fusion network which jointly estimates discrete emotion and human sentiments tasks. On the EMOTIC dataset, the proposed approach achieves an Average Precision (AP) score of 27.17 in comparison to the baseline AP score of 27.38 while reducing the model size by >85%. In addition, we report an on-device AP score of 26.42 with reduction in model size by >93% when compared to the baseline.