Abstract: The swift advancement of unmanned aerial vehicle (UAV) technologies necessitates new standards for developing human-drone interaction (HDI) interfaces. Most interfaces for HDI, especially first-person view (FPV) goggles, limit the operator's ability to obtain information from the environment. This paper presents a novel interface, FlightAR, that integrates augmented reality (AR) overlays of the UAV's FPV and bottom camera feeds into a head-mounted display (HMD) to enhance the pilot's situational awareness. FlightAR provides pilots not only with simultaneous video streams from several UAV cameras but also with the ability to observe their surroundings in real time. User evaluation with the NASA-TLX and UEQ surveys showed low physical demand ($\mu=1.8$, $SD = 0.8$) and good performance ($\mu=3.4$, $SD = 0.8$), demonstrating more favorable user assessments than baseline FPV goggles. Participants also rated the system highly for stimulation ($\mu=2.35$, $SD = 0.9$), novelty ($\mu=2.1$, $SD = 0.9$), and attractiveness ($\mu=1.97$, $SD = 1.0$), indicating a positive user experience. These results demonstrate the potential of the system to improve the UAV piloting experience through enhanced situational awareness and intuitive control. The code is available here: https://github.com/Sautenich/FlightAR
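As a rough illustration of the kind of multi-camera compositing the abstract describes (not the actual FlightAR pipeline; the frame sources, inset layout, and OpenCV approach are assumptions), a minimal sketch might overlay the bottom-camera feed as a picture-in-picture inset on the FPV frame:

```python
import cv2
import numpy as np

def compose_ar_frame(fpv_frame: np.ndarray, bottom_frame: np.ndarray,
                     inset_scale: float = 0.3, margin: int = 20) -> np.ndarray:
    """Overlay the bottom-camera feed as a picture-in-picture inset on the FPV frame."""
    h, w = fpv_frame.shape[:2]
    inset_w = int(w * inset_scale)
    inset_h = int(inset_w * bottom_frame.shape[0] / bottom_frame.shape[1])
    inset = cv2.resize(bottom_frame, (inset_w, inset_h))
    out = fpv_frame.copy()
    # Place the inset in the bottom-right corner with a thin border.
    y0, x0 = h - inset_h - margin, w - inset_w - margin
    out[y0:y0 + inset_h, x0:x0 + inset_w] = inset
    cv2.rectangle(out, (x0, y0), (x0 + inset_w, y0 + inset_h), (255, 255, 255), 2)
    return out
```

In an HMD setting, the composed frame would then be rendered per eye; that step is omitted here.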
Abstract: In the area of multi-drone systems, navigating through dynamic environments from start to goal with collision-free trajectories and efficient path planning is a significant challenge. To address this problem, we propose SwarmPath, a novel technology that integrates an Artificial Potential Field (APF) with an impedance controller. The proposed approach provides collision-free leader-follower behaviour in which drones adapt themselves to the environment. The leader is virtual, while the physical follower drones leverage the APF path-planning approach to find the shortest possible path to the target. Simultaneously, the drones dynamically adjust impedance links, creating virtual links with obstacles in order to avoid them. Compared to conventional APF, the proposed SwarmPath system not only provides smooth collision avoidance but also enables agents to pass efficiently through narrow passages, reducing total travel time by 30% while ensuring safety in terms of drone connectivity. Lastly, the results illustrate that the discrepancy between drone trajectories in the simulated and real environments exhibits an average absolute percentage error (APE) of 6%, which underscores the reliability of our solution in real-world scenarios.
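To illustrate the path-planning half of this pairing, a minimal sketch of the classic APF force law appears below; the gains, the influence radius, and the absence of the impedance-link term are simplifying assumptions, not the SwarmPath implementation:

```python
import numpy as np

def apf_velocity(pos, goal, obstacles, k_att=1.0, k_rep=2.0, rho0=1.5):
    """Classic APF command: attractive pull toward the goal plus a repulsive
    push from every obstacle closer than the influence radius rho0."""
    force = k_att * (goal - pos)                      # attractive term
    for obs in obstacles:
        d_vec = pos - obs
        rho = np.linalg.norm(d_vec)
        if 0 < rho < rho0:
            # Repulsion grows steeply as the drone approaches the obstacle.
            force += k_rep * (1.0 / rho - 1.0 / rho0) / rho**2 * (d_vec / rho)
    return force
```

In the full method, each follower would additionally apply a virtual mass-spring-damper (impedance) term to this command whenever an obstacle link is active, which is what smooths passage through narrow gaps.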
Abstract: This paper focuses on an algorithm for improving the quality of 3D reconstruction and segmentation in DSP-SLAM by enhancing RGB image quality. Our SharpSLAM algorithm aims to decrease the influence of highly dynamic motion on visual object-oriented SLAM through image deblurring, improving all aspects of object-oriented SLAM, including localization, mapping, and object reconstruction. The experimental results revealed a noticeable improvement in object detection quality, with the F-score increasing from 82.9% to 86.2% due to the higher number of features and corresponding map points. The RMSE of the signed distance function also decreased from 17.2 cm to 15.4 cm. Furthermore, our solution enhanced object positioning, with the IoU increasing from 74.5% to 75.7%. The SharpSLAM algorithm has the potential to substantially improve the quality of 3D reconstruction and segmentation in DSP-SLAM and to impact a wide range of fields, including robotics, autonomous vehicles, and augmented reality.
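The abstract does not detail the deblurring method, so the sketch below stands in with a simple unsharp mask purely to illustrate why sharper frames yield more features (and hence more map points) for a SLAM front end; it is not the SharpSLAM deblurring algorithm:

```python
import cv2
import numpy as np

def deblur_and_count_features(image_bgr: np.ndarray):
    """Placeholder deblurring (unsharp mask) followed by ORB feature
    extraction, showing how sharper frames yield more keypoints."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (0, 0), sigmaX=3)
    sharpened = cv2.addWeighted(gray, 1.8, blurred, -0.8, 0)  # unsharp mask
    orb = cv2.ORB_create(nfeatures=2000)
    kp_before = orb.detect(gray, None)
    kp_after = orb.detect(sharpened, None)
    return len(kp_before), len(kp_after)
```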
Abstract: The shape of deformable objects can change drastically during grasping by robotic grippers, causing ambiguous perception of their alignment and hence resulting in errors in robot positioning and telemanipulation. Rendering clear tactile patterns is fundamental to increasing users' precision and dexterity through tactile haptic feedback during telemanipulation. Therefore, different methods have to be studied to decode the sensors' data into haptic stimuli. This work presents a telemanipulation system for plastic pipettes consisting of a Force Dimension Omega.7 haptic interface endowed with two electro-stimulation arrays and two tactile sensor arrays embedded in a 2-finger Robotiq gripper. We propose a novel approach based on convolutional neural networks (CNNs) to detect the tilt of deformable objects. The CNN generates a tactile pattern based on the recognized tilt data to render electro-tactile stimuli to the user during telemanipulation. The study showed that with the CNN algorithm, users' tilt recognition increased from 23.13% with the downsized data to 57.9%, and the success rate during teleoperation increased from 53.12% using the downsized data to 92.18% using the tactile patterns generated by the CNN.
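As an illustration of the kind of classifier described, a minimal PyTorch sketch follows; the input shape (two 16 x 16 finger pads) and the number of tilt classes are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class TiltCNN(nn.Module):
    """Small CNN that classifies object tilt from a two-pad tactile array.
    The 2 x 16 x 16 input and the five tilt classes are illustrative."""
    def __init__(self, n_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x):  # x: (batch, 2, 16, 16)
        return self.classifier(self.features(x).flatten(1))
```

The predicted tilt class would then index into a lookup of electro-tactile patterns rendered on the stimulation arrays.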
Abstract: This paper presents the concept of Industry 6.0, introducing the world's first fully automated production system that autonomously handles the entire product design and manufacturing process based on user-provided natural language descriptions. By leveraging generative AI, the system automates critical aspects of production, including product blueprint design, component manufacturing, logistics, and assembly. A heterogeneous swarm of robots, each equipped with individual AI through integration with Large Language Models (LLMs), orchestrates the production process. The robotic system includes manipulator arms, delivery drones, and 3D printers capable of generating assembly blueprints. The system was evaluated using commercial and open-source LLMs, accessed through APIs and via local deployment. A user study demonstrated that the system reduces the average production time to 119.10 minutes, significantly outperforming a team of expert human developers, who averaged 528.64 minutes (an improvement factor of 4.4). Furthermore, in the product blueprinting stage, the system surpassed human CAD operators by an unprecedented factor of 47, completing the task in 0.5 minutes compared to 23.5 minutes. This breakthrough represents a major leap towards fully autonomous manufacturing.
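As a rough sketch of the LLM-driven orchestration idea, consider the following; the `query_llm` callable, the JSON task schema, and the robot categories are all hypothetical placeholders, not the paper's interface:

```python
import json

def plan_production(description: str, query_llm) -> dict[str, list[str]]:
    """Ask an LLM to decompose a natural-language product description into
    tasks for a heterogeneous robot swarm. `query_llm` is a hypothetical
    callable (prompt -> text); the schema below is illustrative only."""
    prompt = (
        "Decompose this product request into a JSON list of tasks, each with "
        '"robot" (one of: printer, arm, drone) and "action" fields.\n'
        f"Request: {description}"
    )
    tasks = json.loads(query_llm(prompt))
    # Route each task to the queue of the matching robot type.
    queues = {"printer": [], "arm": [], "drone": []}
    for task in tasks:
        queues[task["robot"]].append(task["action"])
    return queues
```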
Abstract: This paper introduces a system for accelerated data collection and real-to-sim transfer for surface recognition on a quadruped robot. The system features a mechanical single-leg setup capable of stepping on various, easily interchangeable surfaces. It also incorporates a GRU-based surface recognition system, inspired by the system detailed in the DogSurf paper. This setup facilitates expanding the dataset for model training, enabling data acquisition from hard-to-reach surfaces under laboratory conditions. Furthermore, it opens avenues for transferring surface properties from reality to simulation, thereby allowing optimal gaits for legged robots to be trained in simulation environments using a pre-prepared library of digital twins of surfaces. Moreover, the GRU-based surface recognition system has been enhanced to integrate data from both the quadruped robot and the single-leg setup. The dataset and code have been made publicly available.
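A minimal PyTorch sketch of a GRU-based surface classifier of this kind is shown below; the feature count, window length, and number of surface classes are illustrative assumptions rather than the paper's configuration:

```python
import torch
import torch.nn as nn

class SurfaceGRU(nn.Module):
    """GRU classifier over a window of foot/IMU sensor readings.
    Input size and the set of surface classes are assumptions."""
    def __init__(self, n_features: int = 12, hidden: int = 64, n_surfaces: int = 6):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_surfaces)

    def forward(self, x):              # x: (batch, time, n_features)
        _, h = self.gru(x)             # h: (num_layers, batch, hidden)
        return self.head(h[-1])        # logits over surface classes
```

Because the classifier only sees sensor sequences, windows recorded on the single-leg setup and on the full robot can be pooled into one training set, which is the integration the abstract describes.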
Abstract: This paper presents the OmniRace approach to controlling a racing drone with six-degree-of-freedom (6-DoF) hand pose estimation and gesture recognition. To our knowledge, it is the first-ever technology that allows low-level control of high-speed drones using gestures. OmniRace employs a gesture interface based on computer vision and a deep neural network to estimate the 6-DoF hand pose. The advanced machine learning algorithm robustly interprets human gestures, allowing users to control drone motion intuitively. Real-time control of a racing drone demonstrates the effectiveness of the system, validating its potential to revolutionize drone racing and other applications. Experimental results obtained in the Gazebo simulation environment revealed that OmniRace allows users to complete the UAV race track significantly faster (by 25.1%) and to shorten the test drone path (from 102.9 m to 83.7 m). Users preferred the gesture interface for attractiveness (1.57 UEQ score), hedonic quality (1.56 UEQ score), and lower perceived temporal demand (32.0 NASA-TLX score), while noting the high efficiency (0.75 UEQ score) and low physical demand (19.0 NASA-TLX score) of the baseline remote controller. The deep neural network attains an average accuracy of 99.75% when applied to both normalized and raw datasets. OmniRace can potentially change the way humans interact with and navigate racing drones in dynamic and complex environments. The source code is available at https://github.com/SerValera/OmniRace.git.
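To make the idea of low-level gesture control concrete, here is a hedged sketch mapping an estimated 6-DoF hand pose to drone commands; the mapping, gains, and limits are assumptions for illustration, not OmniRace's actual control law:

```python
import numpy as np

def pose_to_setpoint(position, rpy, max_tilt=0.5, max_yaw_rate=1.0):
    """Map a 6-DoF hand pose (position in metres, roll/pitch/yaw in radians)
    to low-level drone commands. Gains and limits are illustrative."""
    roll, pitch, yaw = rpy
    return {
        "roll": float(np.clip(roll, -max_tilt, max_tilt)),
        "pitch": float(np.clip(pitch, -max_tilt, max_tilt)),
        "yaw_rate": float(np.clip(yaw, -max_yaw_rate, max_yaw_rate)),
        "thrust": float(np.clip(0.5 + position[2], 0.0, 1.0)),  # hand height -> throttle
    }
```

Mapping hand attitude directly to roll/pitch/yaw-rate setpoints, rather than to waypoints, is what makes the control "low-level" in the sense the abstract uses.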
Abstract: Autonomous drone navigation faces a critical challenge in achieving accurate landings on dynamic platforms, especially under unpredictable conditions such as wind turbulence. Our research introduces TornadoDrone, a novel deep reinforcement learning (DRL) model that adopts bio-inspired mechanisms to adapt to wind forces, mirroring the natural adaptability seen in birds. Unlike traditional approaches, this model derives its adaptability from indirect cues such as changes in position and velocity rather than direct wind force measurements. TornadoDrone was rigorously trained in the gym-pybullet-drones simulator, which closely replicates the complexities of wind dynamics in the real world. Through extensive testing with Crazyflie 2.1 drones in both simulated and real windy conditions, TornadoDrone maintained high-precision landing accuracy on moving platforms, surpassing conventional control methods such as PID controllers with Extended Kalman Filters. The study not only highlights the potential of DRL to tackle complex aerodynamic challenges but also paves the way for advanced autonomous systems that can adapt to environmental changes in real time. The success of TornadoDrone signifies a leap forward in drone technology, particularly for critical applications such as surveillance and emergency response, where reliability and precision are paramount.
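A sketch of how such indirect cues might be assembled into a policy observation follows; the exact feature set is an assumption, not TornadoDrone's observation space:

```python
import numpy as np

def build_observation(pos, vel, platform_pos, platform_vel, prev_vel, dt):
    """Assemble a DRL observation from indirect cues only: relative state
    plus the acceleration inferred from velocity changes, which implicitly
    encodes wind disturbance without any direct wind measurement."""
    rel_pos = platform_pos - pos
    rel_vel = platform_vel - vel
    accel = (vel - prev_vel) / dt   # unmodeled wind shows up as residual acceleration
    return np.concatenate([rel_pos, rel_vel, accel]).astype(np.float32)
```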
Abstract: Achieving safe and precise landings for a swarm of drones poses a significant challenge for conventional control and planning methods. This paper presents the implementation of multi-agent deep reinforcement learning (MADRL) techniques for the precise landing of a drone swarm at relocated target locations. The system is trained in a realistic simulated environment with a maximum velocity of 3 m/s in a 4 x 4 x 4 m training space and deployed on Crazyflie drones with a Vicon indoor localization system. The experimental results revealed that the proposed approach achieved a landing accuracy of 2.26 cm on stationary platforms and 3.93 cm on moving platforms, surpassing a baseline Proportional-Integral-Derivative (PID) controller combined with an Artificial Potential Field (APF). This research highlights drone landing technologies that eliminate the need for analytical centralized systems, potentially offering scalability and revolutionizing applications in logistics, safety, and rescue missions.
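For contrast with the learned policy, the baseline named in the abstract combines a PID controller with an APF planner; a minimal per-axis PID sketch (illustrative gains, not the paper's tuning) is:

```python
import numpy as np

class PID:
    """Minimal 3-axis PID of the kind used in the landing baseline.
    In that baseline, the target fed to `step` would come from an APF planner."""
    def __init__(self, kp=1.2, ki=0.02, kd=0.4):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = np.zeros(3)
        self.prev_err = np.zeros(3)

    def step(self, target, pos, dt):
        err = target - pos
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv
```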
Abstract: This article presents the world's first rapid drone flocking control using natural language through generative AI. The described approach enables the intuitive orchestration of a flock of any size to achieve a desired geometry. The key feature of the method is a new interface based on Large Language Models (LLMs) for communicating with the user and generating the target geometry descriptions. Users can interactively modify the flock geometry model or provide comments during its construction. By combining flocking technology with a target surface defined by a signed distance function (SDF), smooth and adaptive movement of the drone swarm between target states is achieved. Our user study on FlockGPT confirmed a high level of intuitive control over drone flocking. Subjects who had never previously controlled a swarm of drones were able to construct complex figures in just a few iterations and to accurately distinguish the formed drone swarm figures. The results revealed a high recognition rate for six different geometric patterns generated through the LLM-based interface and performed by a simulated drone flock (a mean of 80%, with a maximum of 93% for the cube and tetrahedron patterns). Users commented on the low temporal demand (19.2 NASA-TLX score), high performance (26 NASA-TLX score), attractiveness (1.94 UEQ score), and hedonic quality (1.81 UEQ score) of the developed system. The FlockGPT demo code repository can be found at: coming soon
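To illustrate the SDF-based surface targeting, a minimal sketch follows: a sphere SDF plus a velocity command pulling each drone onto the zero level set. The target shape and gain are assumptions; in FlockGPT the SDF itself would be produced via the LLM interface:

```python
import numpy as np

def sphere_sdf(p, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return np.linalg.norm(p - center, axis=-1) - radius

def surface_attraction(positions, center, radius, gain=1.0, eps=1e-3):
    """Velocity command pulling each drone onto the SDF zero level set
    along the (numerically estimated) distance gradient."""
    d = sphere_sdf(positions, center, radius)[:, None]
    grad = np.zeros_like(positions)
    for i in range(3):  # finite-difference gradient of the SDF
        step = np.zeros(3)
        step[i] = eps
        grad[:, i] = (sphere_sdf(positions + step, center, radius)
                      - sphere_sdf(positions - step, center, radius)) / (2 * eps)
    return -gain * d * grad  # descend the distance field toward the surface
```

Swapping in a different SDF (cube, tetrahedron, text-generated shape) changes the target figure without touching the flocking layer, which is what makes the LLM-generated geometry descriptions composable with the swarm motion.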