Abstract:Many existing obstacle avoidance algorithms overlook the crucial balance between safety and agility, especially in environments of varying complexity. In our study, we introduce an obstacle avoidance pipeline based on reinforcement learning. This pipeline enables drones to adapt their flying speed according to the environmental complexity. Moreover, to improve the obstacle avoidance performance in cluttered environments, we propose a novel latent space. The latent space in this representation is explicitly trained to retain memory of previous depth map observations. Our findings confirm that varying speed leads to a superior balance of success rate and agility in cluttered environments. Additionally, our memory-augmented latent representation outperforms the latent representation commonly used in reinforcement learning. Finally, after minimal fine-tuning, we successfully deployed our network on a real drone for enhanced obstacle avoidance.
Abstract:Aggressive time-optimal control of quadcopters poses a significant challenge in the field of robotics. The state-of-the-art approach leverages reinforcement learning (RL) to train optimal neural policies. However, a critical hurdle is the sim-to-real gap, often addressed by employing a robust inner loop controller -an abstraction that, in theory, constrains the optimality of the trained controller, necessitating margins to counter potential disturbances. In contrast, our novel approach introduces high-speed quadcopter control using end-to-end RL (E2E) that gives direct motor commands. To bridge the reality gap, we incorporate a learned residual model and an adaptive method that can compensate for modeling errors in thrust and moments. We compare our E2E approach against a state-of-the-art network that commands thrust and body rates to an INDI inner loop controller, both in simulated and real-world flight. E2E showcases a significant 1.39-second advantage in simulation and a 0.17-second edge in real-world testing, highlighting end-to-end reinforcement learning's potential. The performance drop observed from simulation to reality shows potential for further improvement, including refining strategies to address the reality gap or exploring offline reinforcement learning with real flight data.
Abstract:Spacecraft and drones aimed at exploring our solar system are designed to operate in conditions where the smart use of onboard resources is vital to the success or failure of the mission. Sensorimotor actions are thus often derived from high-level, quantifiable, optimality principles assigned to each task, utilizing consolidated tools in optimal control theory. The planned actions are derived on the ground and transferred onboard where controllers have the task of tracking the uploaded guidance profile. Here we argue that end-to-end neural guidance and control architectures (here called G&CNets) allow transferring onboard the burden of acting upon these optimality principles. In this way, the sensor information is transformed in real time into optimal plans thus increasing the mission autonomy and robustness. We discuss the main results obtained in training such neural architectures in simulation for interplanetary transfers, landings and close proximity operations, highlighting the successful learning of optimality principles by the neural model. We then suggest drone racing as an ideal gym environment to test these architectures on real robotic platforms, thus increasing confidence in their utilization on future space exploration missions. Drone racing shares with spacecraft missions both limited onboard computational capabilities and similar control structures induced from the optimality principle sought, but it also entails different levels of uncertainties and unmodelled effects. Furthermore, the success of G&CNets on extremely resource-restricted drones illustrates their potential to bring real-time optimal control within reach of a wider variety of robotic systems, both in space and on Earth.
Abstract:Reaching fast and autonomous flight requires computationally efficient and robust algorithms. To this end, we train Guidance & Control Networks to approximate optimal control policies ranging from energy-optimal to time-optimal flight. We show that the policies become more difficult to learn the closer we get to the time-optimal 'bang-bang' control profile. We also assess the importance of knowing the maximum angular rotor velocity of the quadcopter and show that over- or underestimating this limit leads to less robust flight. We propose an algorithm to identify the current maximum angular rotor velocity onboard and a network that adapts its policy based on the identified limit. Finally, we extend previous work on Guidance & Control Networks by learning to take consecutive waypoints into account. We fly a 4x3m track in similar lap times as the differential-flatness-based minimum snap benchmark controller while benefiting from the flexibility that Guidance & Control Networks offer.
Abstract:Developing optimal controllers for aggressive high-speed quadcopter flight is a major challenge in the field of robotics. Recent work has shown that neural networks trained with supervised learning can achieve real-time optimal control in some specific scenarios. In these methods, the networks (termed G&CNets) are trained to learn the optimal state feedback from a dataset of optimal trajectories. An important problem with these methods is the reality gap encountered in the sim-to-real transfer. In this work, we trained G&CNets for energy-optimal end-to-end control on the Bebop drone and identified the unmodeled pitch moment as the main contributor to the reality gap. To mitigate this, we propose an adaptive control strategy that works by learning from optimal trajectories of a system affected by constant external pitch, roll and yaw moments. In real test flights, this model mismatch is estimated onboard and fed to the network to obtain the optimal rpm command. We demonstrate the effectiveness of our method by performing energy-optimal hover-to-hover flights with and without moment feedback. Finally, we compare the adaptive controller to a state-of-the-art differential-flatness-based controller in a consecutive waypoint flight and demonstrate the advantages of our method in terms of energy optimality and robustness.
Abstract:Neuromorphic processing promises high energy efficiency and rapid response rates, making it an ideal candidate for achieving autonomous flight of resource-constrained robots. It will be especially beneficial for complex neural networks as are involved in high-level visual perception. However, fully neuromorphic solutions will also need to tackle low-level control tasks. Remarkably, it is currently still challenging to replicate even basic low-level controllers such as proportional-integral-derivative (PID) controllers. Specifically, it is difficult to incorporate the integral and derivative parts. To address this problem, we propose a neuromorphic controller that incorporates proportional, integral, and derivative pathways during learning. Our approach includes a novel input threshold adaptation mechanism for the integral pathway. This Input-Weighted Threshold Adaptation (IWTA) introduces an additional weight per synaptic connection, which is used to adapt the threshold of the post-synaptic neuron. We tackle the derivative term by employing neurons with different time constants. We first analyze the performance and limits of the proposed mechanisms and then put our controller to the test by implementing it on a microcontroller connected to the open-source tiny Crazyflie quadrotor, replacing the innermost rate controller. We demonstrate the stability of our bio-inspired algorithm with flights in the presence of disturbances. The current work represents a substantial step towards controlling highly dynamic systems with neuromorphic algorithms, thus advancing neuromorphic processing and robotics. In addition, integration is an important part of any temporal task, so the proposed Input-Weighted Threshold Adaptation (IWTA) mechanism may have implications well beyond control tasks.
Abstract:Event cameras have recently gained significant traction since they open up new avenues for low-latency and low-power solutions to complex computer vision problems. To unlock these solutions, it is necessary to develop algorithms that can leverage the unique nature of event data. However, the current state-of-the-art is still highly influenced by the frame-based literature, and usually fails to deliver on these promises. In this work, we take this into consideration and propose a novel self-supervised learning pipeline for the sequential estimation of event-based optical flow that allows for the scaling of the models to high inference frequencies. At its core, we have a continuously-running stateful neural model that is trained using a novel formulation of contrast maximization that makes it robust to nonlinearities and varying statistics in the input events. Results across multiple datasets confirm the effectiveness of our method, which establishes a new state of the art in terms of accuracy for approaches trained or optimized without ground truth.
Abstract:Obstacle avoidance is an essential topic in the field of autonomous drone research. When choosing an avoidance algorithm, many different options are available, each with their advantages and disadvantages. As there is currently no consensus on testing methods, it is quite challenging to compare the performance between algorithms. In this paper, we propose AvoidBench, a benchmarking suite which can evaluate the performance of vision-based obstacle avoidance algorithms by subjecting them to a series of tasks. Thanks to the high fidelity of multi-rotors dynamics from RotorS and virtual scenes of Unity3D, AvoidBench can realize realistic simulated flight experiments. Compared to current drone simulators, we propose and implement both performance and environment metrics to reveal the suitability of obstacle avoidance algorithms for environments of different complexity. To illustrate AvoidBench's usage, we compare three algorithms: Ego-planner, MBPlanner, and Agile-autonomy. The trends observed are validated with real-world obstacle avoidance experiments.
Abstract:This paper discusses a low-cost, open-source and open-hardware design and performance evaluation of a low-speed, multi-fan wind system dedicated to micro air vehicle (MAV) testing. In addition, a set of experiments with a flapping wing MAV and rotorcraft is presented, demonstrating the capabilities of the system and the properties of these different types of drones in response to various types of wind. We performed two sets of experiments where a MAV is flying into the wake of the fan system, gathering data about states, battery voltage and current. Firstly, we focus on steady wind conditions with wind speeds ranging from 0.5 m/s to 3.4 m/s. During the second set of experiments, we introduce wind gusts, by periodically modulating the wind speed from 1.3 m/s to 3.4 m/s with wind gust oscillations of 0.5 Hz, 0.25 Hz and 0.125 Hz. The "Flapper" flapping wing MAV requires much larger pitch angles to counter wind than the "CrazyFlie" quadrotor. This is due to the Flapper's larger wing surface. In forward flight, its wings do provide extra lift, considerably reducing the power consumption. In contrast, the CrazyFlie's power consumption stays more constant for different wind speeds. The experiments with the varying wind show a quicker gust response by the CrazyFlie compared with the Flapper drone, but both their responses could be further improved. We expect that the proposed wind gust system will provide a useful tool to the community to achieve such improvements.
Abstract:Robotics is the next frontier in the progress of Artificial Intelligence (AI), as the real world in which robots operate represents an enormous, complex, continuous state space with inherent real-time requirements. One extreme challenge in robotics is currently formed by autonomous drone racing. Human drone racers can fly through complex tracks at speeds of up to 190 km/h. Achieving similar speeds with autonomous drones signifies tackling fundamental problems in AI under extreme restrictions in terms of resources. In this article, we present the winning solution of the first AI Robotic Racing (AIRR) Circuit, a competition consisting of four races in which all participating teams used the same drone, to which they had limited access. The core of our approach is inspired by how human pilots combine noisy observations of the race gates with their mental model of the drone's dynamics to achieve fast control. Our approach has a large focus on gate detection with an efficient deep neural segmentation network and active vision. Further, we make contributions to robust state estimation and risk-based control. This allowed us to reach speeds of ~9.2m/s in the last race, unrivaled by previous autonomous drone race competitions. Although our solution was the fastest and most robust, it still lost against one of the best human pilots, Gab707. The presented approach indicates a promising direction to close the gap with human drone pilots, forming an important step in bringing AI to the real world.