Abstract:Autonomous navigation in unstructured outdoor environments is inherently challenging due to the presence of asymmetric traversal costs, such as varying energy expenditures for uphill versus downhill movement. Traditional reinforcement learning methods often assume symmetric costs, which can lead to suboptimal navigation paths and increased safety risks in real-world scenarios. In this paper, we introduce QuasiNav, a novel reinforcement learning framework that integrates quasimetric embeddings to explicitly model asymmetric costs and guide efficient, safe navigation. QuasiNav formulates the navigation problem as a constrained Markov decision process (CMDP) and employs quasimetric embeddings to capture directionally dependent costs, allowing for a more accurate representation of the terrain. This approach is combined with adaptive constraint tightening within a constrained policy optimization framework to dynamically enforce safety constraints during learning. We validate QuasiNav across three challenging navigation scenarios-undulating terrains, asymmetric hill traversal, and directionally dependent terrain traversal-demonstrating its effectiveness in both simulated and real-world environments. Experimental results show that QuasiNav significantly outperforms conventional methods, achieving higher success rates, improved energy efficiency, and better adherence to safety constraints.
Abstract:The increasing deployment of autonomous systems in complex environments necessitates efficient communication and task completion among multiple agents. This paper presents SERN (Simulation-Enhanced Realistic Navigation), a novel framework integrating virtual and physical environments for real-time collaborative decision-making in multi-robot systems. SERN addresses key challenges in asset deployment and coordination through a bi-directional communication framework using the AuroraXR ROS Bridge. Our approach advances the SOTA through accurate real-world representation in virtual environments using Unity high-fidelity simulator; synchronization of physical and virtual robot movements; efficient ROS data distribution between remote locations; and integration of SOTA semantic segmentation for enhanced environmental perception. Our evaluations show a 15% to 24% improvement in latency and up to a 15% increase in processing efficiency compared to traditional ROS setups. Real-world and virtual simulation experiments with multiple robots demonstrate synchronization accuracy, achieving less than 5 cm positional error and under 2-degree rotational error. These results highlight SERN's potential to enhance situational awareness and multi-agent coordination in diverse, contested environments.
Abstract:Cover navigation in complex environments is a critical challenge for autonomous robots, requiring the identification and utilization of environmental cover while maintaining efficient navigation. We propose an enhanced navigation system that enables robots to identify and utilize natural and artificial environmental features as cover, thereby minimizing exposure to potential threats. Our perception pipeline leverages LiDAR data to generate high-fidelity cover maps and potential threat maps, providing a comprehensive understanding of the surrounding environment. We train an offline reinforcement learning model using a diverse dataset collected from real-world environments, learning a robust policy that evaluates the quality of candidate actions based on their ability to maximize cover utilization, minimize exposure to threats, and reach the goal efficiently. Extensive real-world experiments demonstrate the superiority of our approach in terms of success rate, cover utilization, exposure minimization, and navigation efficiency compared to state-of-the-art methods.
Abstract:Autonomous robots exploring unknown areas face a significant challenge -- navigating effectively without prior maps and with limited external feedback. This challenge intensifies in sparse reward environments, where traditional exploration techniques often fail. In this paper, we introduce TopoNav, a novel framework that empowers robots to overcome these constraints and achieve efficient, adaptable, and goal-oriented exploration. TopoNav's fundamental building blocks are active topological mapping, intrinsic reward mechanisms, and hierarchical objective prioritization. Throughout its exploration, TopoNav constructs a dynamic topological map that captures key locations and pathways. It utilizes intrinsic rewards to guide the robot towards designated sub-goals within this map, fostering structured exploration even in sparse reward settings. To ensure efficient navigation, TopoNav employs the Hierarchical Objective-Driven Active Topologies framework, enabling the robot to prioritize immediate tasks like obstacle avoidance while maintaining focus on the overall goal. We demonstrate TopoNav's effectiveness in simulated environments that replicate real-world conditions. Our results reveal significant improvements in exploration efficiency, navigational accuracy, and adaptability to unforeseen obstacles, showcasing its potential to revolutionize autonomous exploration in a wide range of applications, including search and rescue, environmental monitoring, and planetary exploration.
Abstract:This study presents a comparative analysis between single-objective and multi-objective reinforcement learning methods for training a robot to navigate effectively to an end goal while efficiently avoiding obstacles. Traditional reinforcement learning techniques, namely Deep Q-Network (DQN), Deep Deterministic Policy Gradient (DDPG), and Twin Delayed DDPG (TD3), have been evaluated using the Gazebo simulation framework in a variety of environments with parameters such as random goal and robot starting locations. These methods provide a numerical reward to the robot, offering an indication of action quality in relation to the goal. However, their limitations become apparent in complex settings where multiple, potentially conflicting, objectives are present. To address these limitations, we propose an approach employing Multi-Objective Reinforcement Learning (MORL). By modifying the reward function to return a vector of rewards, each pertaining to a distinct objective, the robot learns a policy that effectively balances the different goals, aiming to achieve a Pareto optimal solution. This comparative study highlights the potential for MORL in complex, dynamic robotic navigation tasks, setting the stage for future investigations into more adaptable and robust robotic behaviors.
Abstract:Autonomous navigation in offroad environments has been extensively studied in the robotics field. However, navigation in covert situations where an autonomous vehicle needs to remain hidden from outside observers remains an underexplored area. In this paper, we propose a novel Deep Reinforcement Learning (DRL) based algorithm, called CoverNav, for identifying covert and navigable trajectories with minimal cost in offroad terrains and jungle environments in the presence of observers. CoverNav focuses on unmanned ground vehicles seeking shelters and taking covers while safely navigating to a predefined destination. Our proposed DRL method computes a local cost map that helps distinguish which path will grant the maximal covertness while maintaining a low cost trajectory using an elevation map generated from 3D point cloud data, the robot's pose, and directed goal information. CoverNav helps robot agents to learn the low elevation terrain using a reward function while penalizing it proportionately when it experiences high elevation. If an observer is spotted, CoverNav enables the robot to select natural obstacles (e.g., rocks, houses, disabled vehicles, trees, etc.) and use them as shelters to hide behind. We evaluate CoverNav using the Unity simulation environment and show that it guarantees dynamically feasible velocities in the terrain when fed with an elevation map generated by another DRL based navigation algorithm. Additionally, we evaluate CoverNav's effectiveness in achieving a maximum goal distance of 12 meters and its success rate in different elevation scenarios with and without cover objects. We observe competitive performance comparable to state of the art (SOTA) methods without compromising accuracy.
Abstract:Nowadays, autonomous cars are gaining traction due to their numerous potential applications on battlefields and in resolving a variety of other real-world challenges. The main goal of our project is to build an autonomous system using DeepRacer which will follow a specific person (for our project, a soldier) when they will be moving in any direction. Two main components to accomplish this project is an optimized Single-Shot Multibox Detection (SSD) object detection model and a Reinforcement Learning (RL) model. We accomplished the task using SSD Lite instead of SSD and at the end, compared the results among SSD, SSD with Neural Computing Stick (NCS), and SSD Lite. Experimental results show that SSD Lite gives better performance among these three techniques and exhibits a considerable boost in inference speed (~2-3 times) without compromising accuracy.
Abstract:Nowadays, autonomous vehicles are gaining traction due to their numerous potential applications in resolving a variety of other real-world challenges. However, developing autonomous vehicles need huge amount of training and testing before deploying it to real world. While the field of reinforcement learning (RL) has evolved into a powerful learning framework to the development of deep representation learning, and it is now capable of learning complicated policies in high-dimensional environments like in autonomous vehicles. In this regard, we make an effort, using Deep Q-Learning, to discover a method by which an autonomous car may maintain its lane at top speed while avoiding other vehicles. After that, we used CARLA simulation environment to test and verify our newly acquired policy based on the problem formulation.
Abstract:There have thousands of crimes are happening daily all around. But people keep statistics only few of them, therefore crime rates are increasing day by day. The reason behind can be less concern or less statistics of previous crimes. It is much more important to observe the previous crime statistics for general people to make their outing decision and police for catching the criminals are taking steps to restrain the crimes and tourists to make their travelling decision. National institute of justice releases crime survey data for the country, but does not offer crime statistics up to Union or Thana level. Considering all of these cases we have come up with an approach which can give an approximation to people about the safety of a specific location with crime ranking of different areas locating the crimes on a map including a future crime occurrence prediction mechanism. Our approach relies on different online Bangla newspapers for crawling the crime data, stemming and keyword extraction, location finding algorithm, cosine similarity, naive Bayes classifier, and a custom crime prediction model
Abstract:With the advancement of modern robotics, autonomous agents are now capable of hosting sophisticated algorithms, which enables them to make intelligent decisions. But developing and testing such algorithms directly in real-world systems is tedious and may result in the wastage of valuable resources. Especially for heterogeneous multi-agent systems in battlefield environments where communication is critical in determining the system's behavior and usability. Due to the necessity of simulators of separate paradigms (co-simulation) to simulate such scenarios before deploying, synchronization between those simulators is vital. Existing works aimed at resolving this issue fall short of addressing diversity among deployed agents. In this work, we propose \textit{SynchroSim}, an integrated co-simulation middleware to simulate a heterogeneous multi-robot system. Here we propose a velocity difference-driven adjustable window size approach with a view to reducing packet loss probability. It takes into account the respective velocities of deployed agents to calculate a suitable window size before transmitting data between them. We consider our algorithm-specific simulator agnostic but for the sake of implementation results, we have used Gazebo as a Physics simulator and NS-3 as a network simulator. Also, we design our algorithm considering the Perception-Action loop inside a closed communication channel, which is one of the essential factors in a contested scenario with the requirement of high fidelity in terms of data transmission. We validate our approach empirically at both the simulation and system level for both line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios. Our approach achieves a noticeable improvement in terms of reducing packet loss probability ($\approx$11\%), and average packet delay ($\approx$10\%) compared to the fixed window size-based synchronization approach.