Integrated into existing Mobile Edge Computing (MEC) systems, Unmanned Aerial Vehicles (UAVs) are a cornerstone for meeting the stringent requirements of future Internet of Things (IoT) networks. This work studies an MEC system in which a computation-capable UAV, wirelessly linked to a cloud server, handles task offloading from IoT devices over the uplink. We analyze the performance of this system by formulating a resource allocation problem that maximizes the long-term computed task efficiency while ensuring the stability of the task buffers at the IoT devices, the UAV, and the cloud. The problem jointly optimizes the uplink transmit power of the IoT devices, their offloading decisions, the trajectory of the UAV, and the computing power at all transceivers. Given the non-convex and stochastic nature of the problem, we devise a multi-step solution approach. First, by invoking fractional programming and Lyapunov theory, we transform the long-term optimization problem into an equivalent per-time-slot form. We then recast the reformulated problem as a Markov Decision Process (MDP) that captures the network dynamics. Finally, the MDP model serves to train a Meta Twin Delayed Deep Deterministic Policy Gradient (MTD3) agent, which performs adaptive resource allocation under the MEC system variations arising from the mobility of the UAV and the IoT devices. Simulations show that our proposed resource allocation approach outperforms its Deep Reinforcement Learning (DRL)-based counterparts, achieving higher computed task efficiency and shorter task buffer lengths.