Abstract: Multi-task imitation learning (MTIL) has shown significant potential in robotic manipulation by enabling agents to perform various tasks using a unified policy. This simplifies policy deployment and enhances the agent's adaptability across different contexts. However, key challenges remain, such as maintaining action reliability (e.g., avoiding abnormal action sequences that deviate from nominal task trajectories), distinguishing between similar tasks, and generalizing to unseen scenarios. To address these challenges, we introduce the Foresight-Augmented Manipulation Policy (FoAM), an innovative MTIL framework. FoAM not only learns to mimic expert actions but also predicts the visual outcomes of those actions to enhance decision-making. Additionally, it integrates multi-modal goal inputs, such as visual and language prompts, overcoming the limitations of single-conditioned policies. We evaluated FoAM on more than 100 tasks in both simulation and real-world settings, demonstrating that it significantly improves IL policy performance, outperforming current state-of-the-art IL baselines by up to 41% in success rate. Furthermore, we released a simulation benchmark for robotic manipulation, featuring 10 task suites and over 80 challenging tasks designed for multi-task policy training and evaluation. See the project homepage, https://projFoAM.github.io/, for details.
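The idea of training a policy both to imitate expert actions and to predict the visual outcome of those actions can be illustrated with a combined loss. The sketch below is only an assumption about how such an objective might look: the function name `foresight_augmented_loss`, the L1 action term, the mean-squared-error image term, and the `weight` parameter are all hypothetical and are not taken from the abstract.

```python
import numpy as np

def foresight_augmented_loss(pred_actions, expert_actions,
                             pred_image, outcome_image, weight=0.5):
    """Hypothetical combined objective: imitation term plus foresight term.

    All names and loss choices here are illustrative assumptions.
    """
    pred_actions = np.asarray(pred_actions, dtype=float)
    expert_actions = np.asarray(expert_actions, dtype=float)
    pred_image = np.asarray(pred_image, dtype=float)
    outcome_image = np.asarray(outcome_image, dtype=float)

    # Imitation term: mean absolute error between predicted and expert actions.
    action_loss = np.mean(np.abs(pred_actions - expert_actions))
    # Foresight term: mean squared error between the predicted and the
    # observed visual outcome of the action sequence.
    foresight_loss = np.mean((pred_image - outcome_image) ** 2)
    # Weighted sum; `weight` trades off the two terms (assumed hyperparameter).
    return action_loss + weight * foresight_loss
```

In this sketch a policy that matches the expert's actions but mispredicts the resulting image is still penalised, which is one plausible way for a foresight signal to shape the learned representation.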
Abstract: In this paper, we propose to deploy multiple unmanned aerial vehicle (UAV)-mounted base stations to serve ground users in outdoor environments with obstacles. In particular, geographic information is employed to capture the blockage effects that buildings cause on air-to-ground (A2G) links, and a realistic blockage-aware A2G channel model is proposed to characterize the continuous variation of the channels at different locations. Based on the proposed channel model, we formulate the joint optimization problem of UAV three-dimensional (3-D) positioning and resource allocation, comprising power allocation, user association, and subcarrier allocation, to maximize the minimum achievable rate among users. To solve this non-convex combinatorial programming problem, we introduce a penalty term to relax it and develop a suboptimal solution via a penalty-based double-loop iterative optimization framework. The inner loop solves the penalized problem by employing the block successive convex approximation (BSCA) technique, where the UAV positioning and resource allocation are alternately optimized in each iteration. The outer loop aims to obtain proper penalty multipliers so that the solution of the penalized problem converges to that of the original problem. Simulation results demonstrate the superiority of the proposed algorithm over other benchmark schemes in terms of the minimum achievable rate.
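The penalty-based double loop described above can be sketched on a toy problem. The example below is not the paper's algorithm: it assumes a stand-in objective (minimise ||x - target||^2 over binary x), relaxes x to [0, 1], penalises non-binary values with rho * x * (1 - x), and uses a successive convex approximation step in the inner loop. Block alternation between positioning and resource allocation is omitted, and the name `penalty_double_loop` and all parameters are hypothetical.

```python
import numpy as np

def penalty_double_loop(target, rho=0.1, growth=2.0, tol=1e-6,
                        max_outer=50, max_inner=200):
    """Toy problem: minimise ||x - target||^2 subject to x in {0, 1}^n.

    The binary constraint is relaxed to 0 <= x <= 1 and enforced through the
    concave penalty rho * sum(x * (1 - x)), which is zero only at binary x.
    """
    target = np.asarray(target, dtype=float)
    x = np.clip(target, 0.0, 1.0)  # solution of the relaxed problem without penalty
    for _ in range(max_outer):
        # Inner loop: successive convex approximation of the penalised problem.
        for _ in range(max_inner):
            # Linearise the concave penalty at the current point; its gradient
            # is rho * (1 - 2x). The linearisation upper-bounds the penalty.
            grad = rho * (1.0 - 2.0 * x)
            # Minimise the convex surrogate (x - target)^2 + grad * x on [0, 1].
            x_new = np.clip(target - 0.5 * grad, 0.0, 1.0)
            converged = np.max(np.abs(x_new - x)) < 1e-9
            x = x_new
            if converged:
                break
        # Outer loop: stop once x is (numerically) binary, else raise the penalty.
        if np.sum(x * (1.0 - x)) < tol:
            break
        rho *= growth
    return x, rho
```

Calling `penalty_double_loop([0.3, 0.8, 0.45, 0.6])` drives each entry to the nearest binary value as the multiplier grows. An entry sitting exactly at 0.5 is a stationary point of this toy iteration and would need a tie-breaking rule.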
Abstract: In this paper, we study the use of geographic information to address the blockage problem of air-to-ground links between a UAV and terrestrial nodes. In particular, a UAV relay is deployed to establish communication links from a ground base station to multiple ground users. To improve communication capacity, we first model the blockage effect caused by buildings according to the three-dimensional (3-D) geographic information. Then, an optimization problem is formulated to maximize the minimum capacity among users by jointly optimizing the 3-D position and power allocation of the UAV relay, under the constraints of link capacity, maximum transmit power, and blockage. To solve this complex non-convex problem, a two-loop optimization framework is developed based on Lagrangian relaxation. The outer loop aims to obtain proper Lagrangian multipliers so that the solution of the Lagrangian problem converges to the tightest upper bound on the original problem. The inner loop solves the Lagrangian problem by applying the block coordinate descent (BCD) and successive convex approximation (SCA) techniques, where UAV 3-D positioning and power allocation are alternately optimized in each iteration. Simulation results confirm that the proposed solution significantly outperforms two benchmark schemes and achieves performance close to the upper bound for the UAV relay system.
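The two-loop structure of Lagrangian relaxation (an inner loop that solves the Lagrangian problem for fixed multipliers, and an outer loop that adjusts the multipliers) can be illustrated with a much simpler problem than the one in the abstract. The sketch below substitutes a convex sum-rate power allocation under a total power budget, where the inner problem has a closed-form solution and the outer loop finds the multiplier by bisection. It does not model blockage, UAV positioning, BCD, or SCA, and the function names are hypothetical.

```python
import numpy as np

def inner_solve(gains, lam):
    """Inner step: maximise sum(log(1 + g * p)) - lam * sum(p) over p >= 0.

    Setting the derivative g / (1 + g * p) - lam to zero gives
    p = 1 / lam - 1 / g, truncated at zero.
    """
    return np.maximum(0.0, 1.0 / lam - 1.0 / gains)

def two_loop_power_allocation(gains, p_max, tol=1e-9):
    """Outer step: bisect on the multiplier lam of the power-budget constraint."""
    gains = np.asarray(gains, dtype=float)
    # At lam = max(gains) every power is zero; near lam = 0 the budget is exceeded.
    lam_lo, lam_hi = 1e-12, gains.max()
    while lam_hi - lam_lo > tol:
        lam = 0.5 * (lam_lo + lam_hi)
        powers = inner_solve(gains, lam)
        if powers.sum() > p_max:
            lam_lo = lam  # budget violated: raise the price of power
        else:
            lam_hi = lam  # budget satisfied: try a lower price
    return inner_solve(gains, lam_hi), lam_hi
```

Because this toy problem is convex, the dual solution coincides with the primal optimum. For a non-convex problem such as the one in the abstract, the dual value is in general only an upper bound on the achievable maximum, which is why the authors describe it as the tightest upper bound.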
Abstract: Unmanned aerial vehicles (UAVs) have found widespread commercial, civilian, and military applications. Wireless communication has always been one of the core technologies for UAVs. However, communication capacity is becoming a bottleneck for UAVs in supporting more challenging application scenarios. The heavily occupied sub-6 GHz frequency band is not sufficient to meet ultra-high data traffic requirements. The utilization of the millimeter-wave (mmWave) frequency bands is a promising direction for UAV communications, where large antenna arrays can be packed in a small area on the UAV to perform three-dimensional (3D) beamforming. On the other hand, UAVs serving as aerial access points or relays can significantly enhance the coverage and quality of service of terrestrial mmWave cellular networks. In this paper, we provide a comprehensive survey of mmWave beamforming-enabled UAV communications and networking. The technical potential of and challenges for mmWave-UAV communications are presented first. Then, we provide an overview of relevant mmWave antenna structures and channel modeling. Subsequently, the technologies and solutions for UAV-connected mmWave cellular networks and mmWave-UAV ad hoc networks are reviewed, respectively. Finally, we present open issues and promising directions for future research in mmWave beamforming-enabled UAV communications and networking.