Abstract:A new class of Multi-Rotor Aerial Vehicles (MRAVs), known as omnidirectional MRAVs (o-MRAVs), has attracted significant interest in the robotics community. These MRAVs have the unique capability of independently controlling their 3D position and 3D orientation. In the context of aerial communication networks, this translates into the ability to control the position and orientation of the antenna mounted on the MRAV without any additional devices tasked for antenna orientation. This additional Degrees of Freedom (DoF) adds a new dimension to aerial communication systems, creating various research opportunities in communications-aware trajectory planning and positioning. This paper presents this new class of MRAVs and discusses use cases in areas such as physical layer security and optical communications. Furthermore, the benefits of these MRAVs are illustrated with realistic simulation scenarios. Finally, new research problems and opportunities introduced by this advanced robotics technology are discussed.
Abstract:Asthma rates have risen globally, driven by environmental and lifestyle factors. Access to immediate medical care is limited, particularly in developing countries, necessitating automated support systems. Large Language Models like ChatGPT (Chat Generative Pre-trained Transformer) and Gemini have advanced natural language processing in general and question answering in particular, however, they are prone to producing factually incorrect responses (i.e. hallucinations). Retrieval-augmented generation systems, integrating curated documents, can improve large language models' performance and reduce the incidence of hallucination. We introduce AsthmaBot, a multi-lingual, multi-modal retrieval-augmented generation system for asthma support. Evaluation of an asthma-related frequently asked questions dataset shows AsthmaBot's efficacy. AsthmaBot has an added interactive and intuitive interface that integrates different data modalities (text, images, videos) to make it accessible to the larger public. AsthmaBot is available online via \url{asthmabot.datanets.org}.
Abstract:Recent speech technologies have led to produce high quality synthesised speech due to recent advances in neural Text to Speech (TTS). However, such TTS models depend on extensive amounts of data that can be costly to produce and is hardly scalable to all existing languages, especially that seldom attention is given to low resource languages. With techniques such as knowledge transfer, the burden of creating datasets can be alleviated. In this paper, we therefore investigate two aspects; firstly, whether data from social media can be used for a small TTS dataset construction, and secondly whether cross lingual transfer learning (TL) for a low resource language can work with this type of data. In this aspect, we specifically assess to what extent multilingual modeling can be leveraged as an alternative to training on monolingual corporas. To do so, we explore how data from foreign languages may be selected and pooled to train a TTS model for a target low resource language. Our findings show that multilingual pre-training is better than monolingual pre-training at increasing the intelligibility and naturalness of the generated speech.
Abstract:Federated learning (FL) involves several clients that share with a fusion center (FC), the model each client has trained with its own data. Conventional FL, which can be interpreted as an estimation or distortion-based approach, ignores the final use of model information (MI) by the FC and the other clients. In this paper, we introduce a novel FL framework in which the FC uses an aggregate version of the MI to make decisions that affect the client's utility functions. Clients cannot choose the decisions and can only use the MI reported to the FC to maximize their utility. Depending on the alignment between the client and FC utilities, the client may have an individual interest in adding strategic noise to the model. This general framework is stated and specialized to the case of clustering, in which noisy cluster representative information is reported. This is applied to the problem of power consumption scheduling. In this context, utility non-alignment occurs, for instance, when the client wants to consume when the price of electricity is low, whereas the FC wants the consumption to occur when the total power is the lowest. This is illustrated with aggregated real data from Ausgrid \cite{ausgrid}. Our numerical analysis clearly shows that the client can increase his utility by adding noise to the model reported to the FC. Corresponding results and source codes can be downloaded from \cite{source-code}.
Abstract:We delve into the issue of node classification within graphs, specifically reevaluating the concept of neighborhood aggregation, which is a fundamental component in graph neural networks (GNNs). Our analysis reveals conceptual flaws within certain benchmark GNN models when operating under the assumption of edge-independent node labels, a condition commonly observed in benchmark graphs employed for node classification. Approaching neighborhood aggregation from a statistical signal processing perspective, our investigation provides novel insights which may be used to design more efficient GNN models.
Abstract:Bladder cancer ranks within the top 10 most diagnosed cancers worldwide and is among the most expensive cancers to treat due to the high recurrence rates which require lifetime follow-ups. The primary tool for diagnosis is cystoscopy, which heavily relies on doctors' expertise and interpretation. Therefore, annually, numerous cases are either undiagnosed or misdiagnosed and treated as urinary infections. To address this, we suggest a deep learning approach for bladder cancer detection and segmentation which combines CNNs with a lightweight positional-encoding-free transformer and dual attention gates that fuse self and spatial attention for feature enhancement. The architecture suggested in this paper is efficient making it suitable for medical scenarios that require real time inference. Experiments have proven that this model addresses the critical need for a balance between computational efficiency and diagnostic accuracy in cystoscopic imaging as despite its small size it rivals large models in performance.
Abstract:The integration of Multi-Rotor Aerial Vehicles (MRAVs) into 5G and 6G networks enhances coverage, connectivity, and congestion management. This fosters communication-aware robotics, exploring the interplay between robotics and communications, but also makes the MRAVs susceptible to malicious attacks, such as jamming. One traditional approach to counter these attacks is the use of beamforming on the MRAVs to apply physical layer security techniques. In this paper, we explore pose optimization as an alternative approach to countering jamming attacks on MRAVs. This technique is intended for omnidirectional MRAVs, which are drones capable of independently controlling both their position and orientation, as opposed to the more common underactuated MRAVs whose orientation cannot be controlled independently of their position. In this paper, we consider an omnidirectional MRAV serving as a Base Station (BS) for legitimate ground nodes, under attack by a malicious jammer. We optimize the MRAV pose (i.e., position and orientation) to maximize the minimum Signal-to-Interference-plus-Noise Ratio (SINR) over all legitimate nodes.
Abstract:According to the World Health Organization (WHO), air pollution kills seven million people every year. Outdoor air pollution is a major environmental health problem affecting low, middle, and high-income countries. In the past few years, the research community has explored IoT-enabled machine learning applications for outdoor air pollution prediction. The general objective of this paper is to systematically review applications of machine learning and Internet of Things (IoT) for outdoor air pollution prediction and the combination of monitoring sensors and input features used. Two research questions were formulated for this review. 1086 publications were collected in the initial PRISMA stage. After the screening and eligibility phases, 37 papers were selected for inclusion. A cost-based analysis was conducted on the findings to highlight high-cost monitoring, low-cost IoT and hybrid enabled prediction. Three methods of prediction were identified: time series, feature-based and spatio-temporal. This review's findings identify major limitations in applications found in the literature, namely lack of coverage, lack of diversity of data and lack of inclusion of context-specific features. This review proposes directions for future research and underlines practical implications in healthcare, urban planning, global synergy and smart cities.
Abstract:Internet of Things (IoT) sensors are nowadays heavily utilized in various real-world applications ranging from wearables to smart buildings passing by agrotechnology and health monitoring. With the huge amounts of data generated by these tiny devices, Deep Learning (DL) models have been extensively used to enhance them with intelligent processing. However, with the urge for smaller and more accurate devices, DL models became too heavy to deploy. It is thus necessary to incorporate the hardware's limited resources in the design process. Therefore, inspired by the human brain known for its efficiency and low power consumption, we propose a shallow bidirectional network based on predictive coding theory and dynamic early exiting for halting further computations when a performance threshold is surpassed. We achieve comparable accuracy to VGG-16 in image classification on CIFAR-10 with fewer parameters and less computational complexity.
Abstract:Offline Reinforcement Learning (RL) is structured to derive policies from static trajectory data without requiring real-time environment interactions. Recent studies have shown the feasibility of framing offline RL as a sequence modeling task, where the sole aim is to predict actions based on prior context using the transformer architecture. However, the limitation of this single task learning approach is its potential to undermine the transformer model's attention mechanism, which should ideally allocate varying attention weights across different tokens in the input context for optimal prediction. To address this, we reformulate offline RL as a multi-objective optimization problem, where the prediction is extended to states and returns. We also highlight a potential flaw in the trajectory representation used for sequence modeling, which could generate inaccuracies when modeling the state and return distributions. This is due to the non-smoothness of the action distribution within the trajectory dictated by the behavioral policy. To mitigate this issue, we introduce action space regions to the trajectory representation. Our experiments on D4RL benchmark locomotion tasks reveal that our propositions allow for more effective utilization of the attention mechanism in the transformer model, resulting in performance that either matches or outperforms current state-of-the art methods.