Abstract:With the rapid advancement in vehicular communications and intelligent transportation systems technologies, task offloading in vehicular networking scenarios is emerging as a promising, yet challenging, paradigm in mobile edge computing. In this paper, we study the computation offloading problem from mobile vehicles/users, more specifically, the network- and base station selection problem, in a heterogeneous Vehicular Edge Computing (VEC) scenario, where networks have different traffic loads. In a fast-varying vehicular environment, the latency in computation offloading that arises as a result of network congestion (e.g. at the edge computing servers co-located with the base stations) is a key performance metric. However, due to the non-stationary property of such environments, predicting network congestion is an involved task. To address this challenge, we propose an on-line algorithm and an off-policy learning algorithm based on bandit theory. To dynamically select the least congested network in a piece-wise stationary environment, from the offloading history, these algorithms learn the latency that the offloaded tasks experience. In addition, to minimize the task loss due to the mobility of the vehicles, we develop a method for base station selection and a relaying mechanism in the chosen network based on the sojourn time of the vehicles. Through extensive numerical analysis, we demonstrate that the proposed learning-based solutions adapt to the traffic changes of the network by selecting the least congested network. Moreover, the proposed approaches improve the latency of offloaded tasks.