Abstract:Mobile systems will have to support multiple AI-based applications, each leveraging heterogeneous data sources through DNN architectures collaboratively executed within the network. To minimize the cost of the AI inference task subject to requirements on latency, quality, and - crucially - reliability of the inference process, it is vital to optimize (i) the set of sensors/data sources, (ii) the DNN architecture, (iii) the network nodes executing sections of the DNN, and (iv) the resources to use. To this end, we leverage dynamic gated neural networks with branches, and propose a novel algorithmic strategy called Quantile-constrained Inference (QIC), based upon quantile-constrained policy optimization. QIC makes joint, high-quality, swift decisions on all the above aspects of the system, with the aim of minimizing the inference energy cost. We remark that this is the first contribution connecting gated dynamic DNNs with infrastructure-level decision making. We evaluate QIC using a dynamic gated DNN with stems and branches for optimal sensor fusion and inference, trained on the RADIATE dataset offering Radar, LiDAR, and Camera data, and real-world wireless measurements. Our results confirm that QIC matches the optimum and outperforms its alternatives by over 80%.
Abstract:Motivated by the proliferation of Internet-of-Things (IoT) devices and the rapid advances in the field of deep learning, there is a growing interest in pushing deep learning computations, conventionally handled by the cloud, to the edge of the network to deliver faster responses to end users, reduce bandwidth consumption to the cloud, and address privacy concerns. However, to fully realize deep learning at the edge, two main challenges still need to be addressed: (i) how to meet the high resource requirements of deep learning on resource-constrained devices, and (ii) how to leverage the availability of multiple streams of spatially correlated data, to increase the effectiveness of deep learning and improve application-level performance. To address the above challenges, we explore collaborative inference at the edge, in which edge nodes and end devices share correlated data and the inference computational burden by leveraging different ways to split computation and fuse data. Besides traditional centralized and distributed schemes for edge-end device collaborative inference, we introduce selective schemes that decrease bandwidth resource consumption by effectively reducing data redundancy. As a reference scenario, we focus on multi-view classification in a networked system in which sensing nodes can capture overlapping fields of view. The proposed schemes are compared in terms of accuracy, computational expenditure at the nodes, communication overhead, inference latency, robustness, and noise sensitivity. Experimental results highlight that selective collaborative schemes can achieve different trade-offs between the above performance metrics, with some of them bringing substantial communication savings (from 18% to 74% of the transmitted data with respect to centralized inference) while still keeping the inference accuracy well above 90%.
Abstract:Conventional route planning services typically offer the same routes to all drivers, focusing primarily on a few standardized factors such as travel distance or time, overlooking individual driver preferences. With the inception of autonomous vehicles expected in the coming years, where vehicles will rely on routes decided by such planners, there arises a need to incorporate the specific preferences of each driver, ensuring personalized navigation experiences. In this work, we propose a novel approach based on graph neural networks (GNNs) and deep reinforcement learning (DRL), aimed at customizing routes to suit individual preferences. By analyzing the historical trajectories of individual drivers, we classify their driving behavior and associate it with relevant road attributes as indicators of driver preferences. The GNN is capable of representing the road network as graph-structured data effectively, while DRL is capable of making decisions utilizing reward mechanisms to optimize route selection with factors such as travel costs, congestion level, and driver satisfaction. We evaluate our proposed GNN-based DRL framework using a real-world road network and demonstrate its ability to accommodate driver preferences, offering a range of route options tailored to individual drivers. The results indicate that our framework can select routes that accommodate drivers' preferences with up to a 17% improvement compared to a generic route planner, and reduce the travel time by 33% (afternoon) and 46% (evening) relative to the shortest distance-based approach.
Abstract:In pursuit of autonomous vehicles, achieving human-like driving behavior is vital. This study introduces adaptive autopilot (AA), a unique framework utilizing constrained-deep reinforcement learning (C-DRL). AA aims to safely emulate human driving to reduce the necessity for driver intervention. Focusing on the car-following scenario, the process involves (i) extracting data from the highD natural driving study and categorizing it into three driving styles using a rule-based classifier; (ii) employing deep neural network (DNN) regressors to predict human-like acceleration across styles; and (iii) using C-DRL, specifically the soft actor-critic Lagrangian technique, to learn human-like safe driving policies. Results indicate effectiveness in each step: the rule-based classifier distinguishes driving styles, the regressor model accurately predicts acceleration and outperforms traditional car-following models, and the C-DRL agents learn optimal policies for human-like driving across styles.
Abstract:Pruning neural networks, i.e., removing some of their parameters whilst retaining their accuracy, is one of the main ways to reduce the latency of a machine learning pipeline, especially in resource- and/or bandwidth-constrained scenarios. In this context, the pruning technique, i.e., how to choose the parameters to remove, is critical to the system performance. In this paper, we propose a novel pruning approach, called FlexRel, predicated upon combining training-time and inference-time information, namely, parameter magnitude and relevance, in order to improve the resulting accuracy whilst saving both computational resources and bandwidth. Our performance evaluation shows that FlexRel is able to achieve higher pruning factors, saving over 35% bandwidth for typical accuracy targets.
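The idea of ranking parameters by both magnitude and relevance can be illustrated with a minimal sketch. This is not the FlexRel algorithm, whose details the abstract does not give: the linear blend, the `alpha` weight, the normalization, and the function names are all assumptions made for illustration, and the relevance scores are taken as given.

```python
def normalize(values):
    """Scale non-negative scores to [0, 1]; an all-zero list stays all zero."""
    if not values:
        return []
    peak = max(values)
    return [v / peak if peak > 0 else 0.0 for v in values]


def combined_prune_mask(weights, relevance, alpha=0.5, prune_fraction=0.5):
    """Return a keep-mask (True = keep) after pruning the lowest-scoring parameters.

    score = alpha * normalized |weight| + (1 - alpha) * normalized relevance.
    The blend and alpha are hypothetical choices, not taken from the paper.
    """
    magnitude = normalize([abs(w) for w in weights])
    rel = normalize(relevance)
    scores = [alpha * m + (1 - alpha) * r for m, r in zip(magnitude, rel)]
    n_prune = int(len(weights) * prune_fraction)
    # Indices sorted by ascending score; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: scores[i])
    pruned = set(order[:n_prune])
    return [i not in pruned for i in range(len(weights))]


# Hypothetical parameters: the second weight is small but highly relevant.
weights = [0.9, -0.05, 0.4, 0.02]
relevance = [0.1, 0.9, 0.2, 0.0]

magnitude_only = combined_prune_mask(weights, relevance, alpha=1.0)
blended = combined_prune_mask(weights, relevance, alpha=0.5)
print(magnitude_only)
print(blended)
```

With these made-up numbers, magnitude-only pruning discards the small but relevant second parameter, whereas the blended score keeps it and removes the third one instead.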
Abstract:Intersection crossing represents one of the most dangerous sections of the road infrastructure and Connected Vehicles (CVs) can serve as a revolutionary solution to the problem. In this work, we present a novel framework that preemptively detects collisions at urban crossroads, exploiting the Multi-access Edge Computing (MEC) platform of 5G networks. At the MEC, an Intersection Manager (IM) collects information from both vehicles and the road infrastructure to create a holistic view of the area of interest. Based on the historical data collected, the IM leverages the capabilities of an encoder-decoder recurrent neural network to predict, with high accuracy, the vehicles' future trajectories. As, however, accuracy is not a sufficient measure of how much we can trust a model, trajectory predictions are additionally associated with a measure of uncertainty towards confident collision forecasting and avoidance. Hence, contrary to any other approach in the state of the art, an uncertainty-aware collision prediction framework is developed that is shown to detect well in advance (and with high reliability) if two vehicles are on a collision course. Subsequently, collision detection triggers a number of alarms that signal the colliding vehicles to brake. Under real-world settings, thanks to the preemptive capabilities of the proposed approach, all the simulated imminent dangers are averted.
Abstract:The increasing pervasiveness of intelligent mobile applications requires exploiting the full range of resources offered by the mobile-edge-cloud network for the execution of inference tasks. However, due to the heterogeneity of such multi-tiered networks, it is essential to make the applications' demand amenable to the available resources while minimizing energy consumption. Modern dynamic deep neural networks (DNNs) achieve this goal by designing multi-branched architectures where early exits enable sample-based adaptation of the model depth. In this paper, we tackle the problem of allocating sections of DNNs with early exits to the nodes of the mobile-edge-cloud system. By envisioning a 3-stage graph-modeling approach, we represent the possible options for splitting the DNN and deploying the DNN blocks on the multi-tiered network, embedding both the system constraints and the application requirements in a convenient and efficient way. Our framework -- named Feasible Inference Graph (FIN) -- can identify the solution that minimizes the overall inference energy consumption while enabling distributed inference over the multi-tiered network with the target quality and latency. Our results, obtained for DNNs with different levels of complexity, show that FIN matches the optimum and yields over 65% energy savings relative to a state-of-the-art technique for cost minimization.
Abstract:The existing work on the distributed training of machine learning (ML) models has consistently overlooked the distribution of the achieved learning quality, focusing instead on its average value. This leads to poor dependability of the resulting ML models, whose performance may be much worse than expected. We fill this gap by proposing DepL, a framework for dependable learning orchestration, able to make high-quality, efficient decisions on (i) the data to leverage for learning, (ii) the models to use and when to switch among them, and (iii) the clusters of nodes, and the resources thereof, to exploit. For concreteness, we consider as possible available models a full DNN and its compressed versions. Unlike previous studies, DepL guarantees that a target learning quality is reached with a target probability, while keeping the training cost at a minimum. We prove that DepL has constant competitive ratio and polynomial complexity, and show that it outperforms the state-of-the-art by over 27% and closely matches the optimum.
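The difference between an average-quality target and a quality target reached with a given probability can be made concrete with a small sketch. The accuracy samples, thresholds, and function names below are hypothetical and serve only to illustrate the distinction; they are not results or code from DepL.

```python
def mean_quality(samples):
    """Average learning quality over a set of training outcomes."""
    return sum(samples) / len(samples)


def prob_reaching(samples, target):
    """Empirical probability that the quality is at least `target`."""
    return sum(1 for q in samples if q >= target) / len(samples)


# Hypothetical accuracies obtained by two training configurations.
steady = [0.90, 0.91, 0.89, 0.90, 0.90]
erratic = [0.99, 0.99, 0.99, 0.99, 0.55]

TARGET_QUALITY = 0.88      # assumed quality requirement
TARGET_PROBABILITY = 0.95  # assumed probability with which it must be met

for name, samples in (("steady", steady), ("erratic", erratic)):
    p = prob_reaching(samples, TARGET_QUALITY)
    print(name, round(mean_quality(samples), 3), p, p >= TARGET_PROBABILITY)
```

In this made-up example the erratic configuration has the slightly higher mean yet meets the quality target in only four runs out of five, so it fails the probabilistic requirement that the steady configuration satisfies.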
Abstract:We consider the problem of accurately localizing N unmanned aerial vehicles (UAVs) in 3D space where the UAVs are part of a swarm and communicate with each other through orthogonal time-frequency space (OTFS) modulated signals. Each receiving UAV estimates the multipath wireless channel on each link formed by the line-of-sight (LoS) transmission and by the single reflections from the remaining N-2 UAVs. The estimated power delay profiles are communicated to an edge server, which is in charge of computing the exact location and speed of the UAVs. To obtain the UAVs' locations and velocities, we propose an iterative algorithm, named Turbo Iterative Positioning (TIP), which, using a belief-propagation approach, effectively exploits the time difference of arrival (TDoA) measurements between the LoS and the non-LoS paths. Enabling a full cold start (no prior knowledge), our solution first maps each element of the TDoA profile to the ID of the corresponding reflecting UAV. The Doppler shifts measured by the OTFS receivers associated with each path are also used to estimate the UAVs' velocities. The localization of the N UAVs is then derived via gradient descent optimization, with the aid of turbo-like iterations that can progressively correct some of the residual errors in the initial ID mapping operation. Our numerical results, obtained also using real-world traces, show how the multipath links are beneficial to achieving very accurate localization and speed estimation of all UAVs, even with a limited delay-Doppler resolution. The robustness of our scheme is shown by its performance approaching the Cramér-Rao bound.
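The TDoA between the LoS path and a single-reflection path follows directly from the geometry: it is the extra distance travelled via the reflector, divided by the propagation speed. The sketch below shows only this geometric relation, not the TIP algorithm; the coordinates and the function name are hypothetical, and free-space propagation at the speed of light is assumed.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, free-space propagation assumed


def tdoa_los_vs_reflection(tx, rx, reflector):
    """Delay (seconds) of the single-reflection path relative to the LoS path.

    tx, rx, and reflector are 3D positions in meters.
    """
    los = math.dist(tx, rx)
    nlos = math.dist(tx, reflector) + math.dist(reflector, rx)
    return (nlos - los) / SPEED_OF_LIGHT


# Hypothetical positions of a transmitting, a receiving, and a reflecting UAV.
tx = (0.0, 0.0, 0.0)
rx = (100.0, 0.0, 0.0)
reflector = (50.0, 0.0, 50.0)

tdoa = tdoa_los_vs_reflection(tx, rx, reflector)
print(f"extra path length: {tdoa * SPEED_OF_LIGHT:.2f} m")
print(f"TDoA: {tdoa * 1e9:.1f} ns")
```

By the triangle inequality the result is never negative, and it is zero only when the reflector lies on the segment between transmitter and receiver.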
Abstract:Since the start of 5G work in 3GPP in early 2016, tremendous progress has been made in both standardization and commercial deployments. 3GPP is now entering the second phase of 5G standardization, known as 5G-Advanced, built on the 5G baseline in 3GPP Releases 15, 16, and 17. 3GPP Release 18, the start of 5G-Advanced, includes a diverse set of features that cover both device and network evolutions, providing balanced mobile broadband evolution and further vertical domain expansion and accommodating both immediate and long-term commercial needs. 5G-Advanced will significantly expand 5G capabilities, address many new use cases, transform connectivity experiences, and serve as an essential step in developing mobile communications towards 6G. This paper provides a comprehensive overview of the 3GPP 5G-Advanced development, introducing the prominent state-of-the-art technologies investigated in 3GPP and identifying key evolution directions for future research and standardization.