Mihir Kulkarni

Semantically-driven Deep Reinforcement Learning for Inspection Path Planning

May 20, 2025

Grzegorz Malczyk, Mihir Kulkarni, Kostas Alexis

Abstract:This paper introduces a novel semantics-aware inspection planning policy derived through deep reinforcement learning. Reflecting the fact that within autonomous informative path planning missions in unknown environments, it is often only a sparse set of objects of interest that need to be inspected, the method contributes an end-to-end policy that simultaneously performs semantic object visual inspection combined with collision-free navigation. Assuming access only to the instantaneous depth map, the associated segmentation image, the ego-centric local occupancy, and the history of past positions in the robot's neighborhood, the method demonstrates robust generalizability and successful crossing of the sim2real gap. Beyond simulations and extensive comparison studies, the approach is verified in experimental evaluations onboard a flying robot deployed in novel environments with previously unseen semantics and overall geometric configurations.

* Accepted for publication in IEEE Robotics and Automation Letters (RA-L)

A Neural Network Mode for PX4 on Embedded Flight Controllers

May 01, 2025

Sindre M. Hegre, Welf Rehberg, Mihir Kulkarni, Kostas Alexis

Abstract:This paper contributes an open-sourced implementation of a neural-network-based controller framework within the PX4 stack. We develop a custom module for inference on the microcontroller while retaining all of the functionality of the PX4 autopilot. Policies trained in the Aerial Gym Simulator are converted to the TensorFlow Lite format and then built together with PX4 and flashed to the flight controller. The policies substitute the control cascade within PX4 to offer an end-to-end position-setpoint tracking controller directly providing normalized motor RPM setpoints. Experiments conducted in simulation and the real world show similar tracking performance. We thus provide a flight-ready pipeline for testing neural control policies in the real world. The pipeline simplifies the deployment of neural networks on embedded flight controller hardware, thereby accelerating research on learning-based control. Both the Aerial Gym Simulator and the PX4 module are open-sourced at https://github.com/ntnu-arl/aerial_gym_simulator and https://github.com/SindreMHegre/PX4-Autopilot-public/tree/for_paper. Video: https://youtu.be/lY1OKz_UOqM?si=VtzL243BAY3lblTJ.

* 4 pages. Accepted to the Workshop on 25 Years of Aerial Robotics: Challenges and Opportunities (ICRA 2025)

MapQA: Open-domain Geospatial Question Answering on Map Data

Mar 10, 2025

Zekun Li, Malcolm Grossman, Eric Qasemi, Mihir Kulkarni, Muhao Chen, Yao-Yi Chiang

Abstract:Geospatial question answering (QA) is a fundamental task in navigation and point of interest (POI) searches. Although geospatial QA datasets exist, they are limited in both scale and diversity, often relying solely on textual descriptions of geo-entities without considering their geometries. A major challenge in scaling geospatial QA datasets for reasoning lies in the complexity of geospatial relationships, which require integrating spatial structures, topological dependencies, and multi-hop reasoning capabilities that most text-based QA datasets lack. To address these limitations, we introduce MapQA, a novel dataset that not only provides question-answer pairs but also includes the geometries of geo-entities referenced in the questions. MapQA is constructed using SQL query templates to extract question-answer pairs from OpenStreetMap (OSM) for two study regions: Southern California and Illinois. It consists of 3,154 QA pairs spanning nine question types that require geospatial reasoning, such as neighborhood inference and geo-entity type identification. Compared to existing datasets, MapQA expands both the number and diversity of geospatial question types. We explore two approaches to tackle this challenge: (1) a retrieval-based language model that ranks candidate geo-entities by embedding similarity, and (2) a large language model (LLM) that generates SQL queries from natural language questions and geo-entity attributes, which are then executed against an OSM database. Our findings indicate that retrieval-based methods effectively capture concepts like closeness and direction but struggle with questions that require explicit computations (e.g., distance calculations). LLMs (e.g., GPT and Gemini) excel at generating SQL queries for one-hop reasoning but face challenges with multi-hop reasoning, highlighting a key bottleneck in advancing geospatial QA systems.
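
The retrieval-based approach mentioned in the abstract ranks candidate geo-entities by embedding similarity. A minimal sketch of that ranking step is shown below, assuming cosine similarity as the metric and using made-up three-dimensional vectors. The entity names, the vectors, and the helper functions `cosine_similarity` and `rank_candidates` are hypothetical and are not taken from the MapQA paper or its code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(question_embedding, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(question_embedding, embedding))
        for name, embedding in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed for illustration; a real system would obtain
# them from a language model).
question = [0.9, 0.1, 0.0]
candidates = {
    "cafe_a": [0.8, 0.2, 0.1],
    "park_b": [0.0, 0.1, 0.9],
    "school_c": [0.4, 0.4, 0.4],
}

ranking = rank_candidates(question, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A ranking of this kind captures notions such as semantic closeness, but it performs no explicit computation, which is consistent with the abstract's observation that retrieval struggles on questions requiring distance calculations.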

Aerial Gym Simulator: A Framework for Highly Parallelized Simulation of Aerial Robots

Mar 03, 2025

Mihir Kulkarni, Welf Rehberg, Kostas Alexis

Abstract:This paper contributes the Aerial Gym Simulator, a highly parallelized, modular framework for simulation and rendering of arbitrary multirotor platforms based on NVIDIA Isaac Gym. Aerial Gym supports the simulation of under-, fully- and over-actuated multirotors offering parallelized geometric controllers, alongside a custom GPU-accelerated rendering framework for ray-casting capable of capturing depth, segmentation and vertex-level annotations from the environment. Multiple examples for key tasks, such as depth-based navigation through reinforcement learning are provided. The comprehensive set of tools developed within the framework makes it a powerful resource for research on learning for control, planning, and navigation using state information as well as exteroceptive sensor observations. Extensive simulation studies are conducted and successful sim2real transfer of trained policies is demonstrated. The Aerial Gym Simulator is open-sourced at: https://github.com/ntnu-arl/aerial_gym_simulator.

* Accepted for publication in IEEE Robotics and Automation Letters (RA-L)

Neural Control Barrier Functions for Safe Navigation

Jul 29, 2024

Marvin Harms, Mihir Kulkarni, Nikhil Khedekar, Martin Jacquet, Kostas Alexis

Abstract:Autonomous robot navigation can be particularly demanding, especially when the surrounding environment is not known and safety of the robot is crucial. This work relates to the synthesis of Control Barrier Functions (CBFs) through data for safe navigation in unknown environments. A novel methodology to jointly learn CBFs and corresponding safe controllers, in simulation, inspired by the State Dependent Riccati Equation (SDRE) is proposed. The CBF is used to obtain admissible commands from any nominal, possibly unsafe controller. An approach to apply the CBF inside a safety filter without the need for a consistent map or position estimate is developed. Subsequently, the resulting reactive safety filter is deployed on a multirotor platform integrating a LiDAR sensor both in simulation and real-world experiments.

* Accepted for presentation at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024
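
The abstract describes using a CBF to obtain admissible commands from any nominal, possibly unsafe controller. The sketch below illustrates only that general safety-filter idea, under strong simplifying assumptions: a single-integrator model (velocity is commanded directly), a hand-written analytic barrier around one circular obstacle instead of a learned neural CBF, and a known robot position. None of this reproduces the paper's SDRE-inspired learning method or its map-free formulation; the function names and parameters are hypothetical.

```python
def barrier(x, center, radius):
    """h(x) = ||x - c||^2 - r^2, which is non-negative outside the obstacle."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, center)) - radius ** 2


def barrier_gradient(x, center):
    """Gradient of h with respect to x."""
    return [2.0 * (xi - ci) for xi, ci in zip(x, center)]


def safety_filter(x, u_nom, center, radius, alpha=1.0):
    """Minimally modify u_nom so that grad_h(x) . u + alpha * h(x) >= 0.

    For the model x_dot = u this is the closed-form solution of the
    single-constraint quadratic program min ||u - u_nom||^2.
    """
    h = barrier(x, center, radius)
    grad = barrier_gradient(x, center)
    slack = sum(g * u for g, u in zip(grad, u_nom)) + alpha * h
    if slack >= 0.0:
        return list(u_nom)  # nominal command already satisfies the condition
    grad_sq = sum(g * g for g in grad)
    if grad_sq == 0.0:
        return [0.0] * len(u_nom)  # degenerate case at the obstacle center
    scale = slack / grad_sq
    # Project u_nom onto the boundary of the admissible half-space.
    return [u - scale * g for u, g in zip(u_nom, grad)]


# Robot at (2, 0), obstacle of radius 1 at the origin,
# nominal command heading straight at the obstacle.
u_safe = safety_filter([2.0, 0.0], [-5.0, 0.0], [0.0, 0.0], 1.0)
print(u_safe)
```

In this example the filter slows the approach so that the barrier condition holds with equality, while a command pointing away from the obstacle would pass through unchanged.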

Maritime Vessel Tank Inspection using Aerial Robots: Experience from the field and dataset release

Apr 29, 2024

Mihir Dharmadhikari, Nikhil Khedekar, Paolo De Petris, Mihir Kulkarni, Morten Nissov, Kostas Alexis

Abstract:This paper presents field results and lessons learned from the deployment of aerial robots inside ship ballast tanks. Vessel tanks including ballast tanks and cargo holds present dark, dusty environments having simultaneously very narrow openings and wide open spaces that create several challenges for autonomous navigation and inspection operations. We present a system for vessel tank inspection using an aerial robot along with its autonomy modules. We show the results of autonomous exploration and visual inspection in 3 ships spanning across 7 distinct types of sections of the ballast tanks. Additionally, we comment on the lessons learned from the field and possible directions for future work. Finally, we release a dataset consisting of the data from these missions along with data collected with a handheld sensor stick.

* Accepted to the IEEE ICRA Workshop on Field Robotics 2024

Reinforcement Learning for Collision-free Flight Exploiting Deep Collision Encoding

Feb 06, 2024

Mihir Kulkarni, Kostas Alexis

Abstract:This work contributes a novel deep navigation policy that enables collision-free flight of aerial robots based on a modular approach exploiting deep collision encoding and reinforcement learning. The proposed solution builds upon a deep collision encoder that is trained on both simulated and real depth images using supervised learning such that it compresses the high-dimensional depth data to a low-dimensional latent space encoding collision information while accounting for the robot size. This compressed encoding is combined with an estimate of the robot's odometry and the desired target location to train a deep reinforcement learning navigation policy that offers low-latency computation and robust sim2real performance. A set of simulation and experimental studies in diverse environments are conducted and demonstrate the efficiency of the emerged behavior and its resilience in real-life deployments.

* 8 pages, 8 figures. Accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2024

Aerial Field Robotics

Jan 19, 2024

Mihir Kulkarni, Brady Moon, Kostas Alexis, Sebastian Scherer

Abstract:Aerial field robotics research represents the domain of study that aims to equip unmanned aerial vehicles - and as it pertains to this chapter, specifically Micro Aerial Vehicles (MAVs) - with the ability to operate in real-life environments that present challenges to safe navigation. We present the key elements of autonomy for MAVs that are resilient to collisions and sensing degradation, while operating under constrained computational resources. We overview aspects of the state of the art, outline bottlenecks to resilient navigation autonomy, and overview the field-readiness of MAVs. We conclude with notable contributions and discuss considerations for future research that are essential for resilience in aerial robotics.

* Accepted in the Encyclopedia of Robotics, Springer

Autonomous Exploration and General Visual Inspection of Ship Ballast Water Tanks using Aerial Robots

Nov 07, 2023

Mihir Dharmadhikari, Paolo De Petris, Mihir Kulkarni, Nikhil Khedekar, Huan Nguyen, Arnt Erik Stene, Eivind Sjøvold, Kristian Solheim, Bente Gussiaas, Kostas Alexis

Abstract:This paper presents a solution for the autonomous exploration and inspection of Ballast Water Tanks (BWTs) of marine vessels using aerial robots. Ballast tank compartments are critical for a vessel's safety and correspond to confined environments often connected through particularly narrow manholes. The method enables their volumetric exploration combined with visual inspection subject to constraints regarding the viewing distance from a surface. We present evaluation studies in simulation, in a mission consisting of 18 BWT compartments, and in 3 field experiments inside real vessels. The data from one of the experiments is also post-processed to generate semantically-segmented meshes of inspection-important geometries. Geometric models can be associated with onboard camera images for detailed and intuitive analysis.

* 8 pages, 7 figures, accepted for publication at the 2023 IEEE International Conference on Advanced Robotics (ICAR)

Task-driven Compression for Collision Encoding based on Depth Images

Sep 11, 2023

Mihir Kulkarni, Kostas Alexis

Abstract:This paper contributes a novel learning-based method for aggressive task-driven compression of depth images and their encoding as images tailored to collision prediction for robotic systems. A novel 3D image processing methodology is proposed that accounts for the robot's size in order to appropriately "inflate" the obstacles represented in the depth image and thus obtain the distance that can be traversed by the robot in a collision-free manner along any given ray within the camera frustum. Such depth-and-collision image pairs are used to train a neural network that follows the architecture of Variational Autoencoders to compress-and-transform the information in the original depth image to derive a latent representation that encodes the collision information for the given depth image. We compare our proposed task-driven encoding method with classical task-agnostic methods and demonstrate superior performance for the task of collision image prediction from extremely low-dimensional latent spaces. A set of comparative studies show that the proposed approach is capable of encoding depth image-and-collision image tuples from complex scenes with thin obstacles at long distances better than the classical methods at compression ratios as high as 4050:1.

* 14 pages, 5 figures. Accepted to the International Symposium on Visual Computing 2023
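
To give a sense of scale for the quoted 4050:1 compression ratio, the short calculation below divides the number of pixels in a depth image by the number of latent dimensions. The abstract states neither the image resolution nor the latent size; the 270 x 480 image and 32-dimensional latent used here are illustrative assumptions, chosen only because they reproduce the quoted figure.

```python
def compression_ratio(height, width, latent_dim):
    """Number of input pixels per latent dimension."""
    return (height * width) / latent_dim


# Assumed values for illustration (not stated in the abstract).
ratio = compression_ratio(height=270, width=480, latent_dim=32)
print(f"{ratio:.0f}:1")
```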
