Abstract:Unprecedented agility and dexterous manipulation have been demonstrated with controllers based on deep reinforcement learning (RL), with a significant impact on legged and humanoid robots. Modern tooling and simulation platforms, such as NVIDIA Isaac Sim, have been enabling such advances. This article focuses on demonstrating the applications of Isaac in local planning and obstacle avoidance as one of the most fundamental ways in which a mobile robot interacts with its environments. Although there is extensive research on proprioception-based RL policies, the article highlights less standardized and reproducible approaches to exteroception. At the same time, the article aims to provide a base framework for end-to-end local navigation policies and how a custom robot can be trained in such simulation environment. We benchmark end-to-end policies with the state-of-the-art Nav2, navigation stack in Robot Operating System (ROS). We also cover the sim-to-real transfer process by demonstrating zero-shot transferability of policies trained in the Isaac simulator to real-world robots. This is further evidenced by the tests with different simulated robots, which show the generalization of the learned policy. Finally, the benchmarks demonstrate comparable performance to Nav2, opening the door to quick deployment of state-of-the-art end-to-end local planners for custom robot platforms, but importantly furthering the possibilities by expanding the state and action spaces or task definitions for more complex missions. Overall, with this article we introduce the most important steps, and aspects to consider, in deploying RL policies for local path planning and obstacle avoidance with Isaac Sim training, Gazebo testing, and ROS 2 for real-time inference in real robots. The code is available at https://github.com/sahars93/RL-Navigation.
Abstract:The objective of this paper is to enhance the optimization process for neural networks by developing a dynamic learning rate algorithm that effectively integrates exponential decay and advanced anti-overfitting strategies. Our primary contribution is the establishment of a theoretical framework where we demonstrate that the optimization landscape, under the influence of our algorithm, exhibits unique stability characteristics defined by Lyapunov stability principles. Specifically, we prove that the superlevel sets of the loss function, as influenced by our adaptive learning rate, are always connected, ensuring consistent training dynamics. Furthermore, we establish the "equiconnectedness" property of these superlevel sets, which maintains uniform stability across varying training conditions and epochs. This paper contributes to the theoretical understanding of dynamic learning rate mechanisms in neural networks and also pave the way for the development of more efficient and reliable neural optimization techniques. This study intends to formalize and validate the equiconnectedness of loss function as superlevel sets in the context of neural network training, opening newer avenues for future research in adaptive machine learning algorithms. We leverage previous theoretical discoveries to propose training mechanisms that can effectively handle complex and high-dimensional data landscapes, particularly in applications requiring high precision and reliability.
Abstract:Background: The reproducibility of machine-learning models in prostate cancer detection across different MRI vendors remains a significant challenge. Methods: This study investigates Support Vector Machines (SVM) and Random Forest (RF) models trained on radiomic features extracted from T2-weighted MRI images using Pyradiomics and MRCradiomics libraries. Feature selection was performed using the maximum relevance minimum redundancy (MRMR) technique. We aimed to enhance clinical decision support through multimodal learning and feature fusion. Results: Our SVM model, utilizing combined features from Pyradiomics and MRCradiomics, achieved an AUC of 0.74 on the Multi-Improd dataset (Siemens scanner) but decreased to 0.60 on the Philips test set. The RF model showed similar trends, with notable robustness for models using Pyradiomics features alone (AUC of 0.78 on Philips). Conclusions: These findings demonstrate the potential of multimodal feature integration to improve the robustness and generalizability of machine-learning models for clinical decision support in prostate cancer detection. This study marks a significant step towards developing reliable AI-driven diagnostic tools that maintain efficacy across various imaging platforms.
Abstract:Autonomous systems are becoming inherently ubiquitous with the advancements of computing and communication solutions enabling low-latency offloading and real-time collaboration of distributed devices. Decentralized technologies with blockchain and distributed ledger technologies (DLTs) are playing a key role. At the same time, advances in deep learning (DL) have significantly raised the degree of autonomy and level of intelligence of robotic and autonomous systems. While these technological revolutions were taking place, raising concerns in terms of data security and end-user privacy has become an inescapable research consideration. Federated learning (FL) is a promising solution to privacy-preserving DL at the edge, with an inherently distributed nature by learning on isolated data islands and communicating only model updates. However, FL by itself does not provide the levels of security and robustness required by today's standards in distributed autonomous systems. This survey covers applications of FL to autonomous robots, analyzes the role of DLT and FL for these systems, and introduces the key background concepts and considerations in current research.
Abstract:Unmanned aerial vehicles (UAVs) are becoming largely ubiquitous with an increasing demand for aerial data. Accurate navigation and localization, required for precise data collection in many industrial applications, often relies on RTK GNSS. These systems, able of centimeter-level accuracy, require a setup and calibration process and are relatively expensive. This paper addresses the problem of accurate positioning and navigation of UAVs through cooperative localization. Inexpensive ultra-wideband (UWB) transceivers installed on both the UAV and a support ground robot enable centimeter-level relative positioning. With fast deployment and wide setup flexibility, the proposed system is able to accommodate different environments and can also be utilized in GNSS-denied environments. Through extensive simulations and test fields, we evaluate the accuracy of the system and compare it to GNSS in urban environments where multipath transmission degrades accuracy. For completeness, we include visual-inertial odometry in the experiments and compare the performance with the UWB-based cooperative localization.
Abstract:Ultra-wideband (UWB) technology is a mature technology that contested other wireless technologies in the advent of the IoT but did not achieve the same levels of widespread adoption. In recent years, however, with its potential as a wireless ranging and localization solution, it has regained momentum. Within the robotics field, UWB positioning systems are being increasingly adopted for localizing autonomous ground or aerial robots. In the Industrial IoT (IIoT) domain, its potential for ad-hoc networking and simultaneous positioning is also being explored. This survey overviews the state-of-the-art in UWB networking and localization for robotic and autonomous systems. We also cover novel techniques focusing on more scalable systems, collaborative approaches to localization, ad-hoc networking, and solutions involving machine learning to improve accuracy. This is, to the best of our knowledge, the first survey to put together the robotics and IIoT perspectives and to emphasize novel ranging and positioning modalities. We complete the survey with a discussion on current trends and open research problems.
Abstract:Vehicles with prolonged autonomous missions have to maintain environment awareness by simultaneous localization and mapping (SLAM). Closed loop correction is substituted by interpolation in rigid body transformation space in order to systematically reduce the accumulated error over different scales. The computation is divided to an edge computed lightweight SLAM and iterative corrections in the cloud environment. Tree locations in the forest environment are sent via a potentially limited communication bandwidths. Data from a real forest site is used in the verification of the proposed algorithm. The algorithm adds new iterative closest point (ICP) cases to the initial SLAM and measures the resulting map quality by the mean of the root mean squared error (RMSE) of individual tree clusters. Adding 4% more match cases yields the mean RMSE 0.15 m on a large site with 180 m odometric distance.
Abstract:Event cameras, i.e., the Dynamic and Active-pixel Vision Sensor (DAVIS) ones, capture the intensity changes in the scene and generates a stream of events in an asynchronous fashion. The output rate of such cameras can reach up to 10 million events per second in high dynamic environments. DAVIS cameras use novel vision sensors that mimic human eyes. Their attractive attributes, such as high output rate, High Dynamic Range (HDR), and high pixel bandwidth, make them an ideal solution for applications that require high-frequency tracking. Moreover, applications that operate in challenging lighting scenarios can exploit the high HDR of event cameras, i.e., 140 dB compared to 60 dB of traditional cameras. In this paper, a novel asynchronous corner tracking method is proposed that uses both events and intensity images captured by a DAVIS camera. The Harris algorithm is used to extract features, i.e., frame-corners from keyframes, i.e., intensity images. Afterward, a matching algorithm is used to extract event-corners from the stream of events. Events are solely used to perform asynchronous tracking until the next keyframe is captured. Neighboring events, within a window size of 5x5 pixels around the event-corner, are used to calculate the velocity and direction of extracted event-corners by fitting the 2D planar using a randomized Hough transform algorithm. Experimental evaluation showed that our approach is able to update the location of the extracted corners up to 100 times during the blind time of traditional cameras, i.e., between two consecutive intensity images.
Abstract:Moving towards autonomy, unmanned vehicles rely heavily on state-of-the-art collision avoidance systems (CAS). However, the detection of obstacles especially during night-time is still a challenging task since the lighting conditions are not sufficient for traditional cameras to function properly. Therefore, we exploit the powerful attributes of event-based cameras to perform obstacle detection in low lighting conditions. Event cameras trigger events asynchronously at high output temporal rate with high dynamic range of up to 120 $dB$. The algorithm filters background activity noise and extracts objects using robust Hough transform technique. The depth of each detected object is computed by triangulating 2D features extracted utilising LC-Harris. Finally, asynchronous adaptive collision avoidance (AACA) algorithm is applied for effective avoidance. Qualitative evaluation is compared using event-camera and traditional camera.
Abstract:Event-based cameras are vision devices that transmit only brightness changes with low latency and ultra-low power consumption. Such characteristics make event-based cameras attractive in the field of localization and object tracking in resource-constrained systems. Since the number of generated events in such cameras is huge, the selection and filtering of the incoming events are beneficial from both increasing the accuracy of the features and reducing the computational load. In this paper, we present an algorithm to detect asynchronous corners from a stream of events in real-time on embedded systems. The algorithm is called the Three Layer Filtering-Harris or TLF-Harris algorithm. The algorithm is based on an events' filtering strategy whose purpose is 1) to increase the accuracy by deliberately eliminating some incoming events, i.e., noise, and 2) to improve the real-time performance of the system, i.e., preserving a constant throughput in terms of input events per second, by discarding unnecessary events with a limited accuracy loss. An approximation of the Harris algorithm, in turn, is used to exploit its high-quality detection capability with a low-complexity implementation to enable seamless real-time performance on embedded computing platforms. The proposed algorithm is capable of selecting the best corner candidate among neighbors and achieves an average execution time savings of 59 % compared with the conventional Harris score. Moreover, our approach outperforms the competing methods, such as eFAST, eHarris, and FA-Harris, in terms of real-time performance, and surpasses Arc* in terms of accuracy.