Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yogesh Girdhar

Applied Ocean Physics and Engineering Department, Woods Hole Oceanographic Institution

Measuring and Minimizing Disturbance of Marine Animals to Underwater Vehicles

Jun 12, 2025

Levi Cai, Youenn Jézéquel, T. Aran Mooney, Yogesh Girdhar

Abstract:Do fish respond to the presence of underwater vehicles, potentially biasing our estimates about them? If so, are there strategies to measure and mitigate this response? This work provides a theoretical and practical framework towards bias-free estimation of animal behavior from underwater vehicle observations. We also provide preliminary results from the field in coral reef environments to address these questions.

* Accepted to ISER 2025

Via

Access Paper or Ask Questions

DataS^3: Dataset Subset Selection for Specialization

Apr 22, 2025

Neha Hulkund, Alaa Maalouf, Levi Cai, Daniel Yang, Tsun-Hsuan Wang, Abigail O'Neil, Timm Haucke, Sandeep Mukherjee, Vikram Ramaswamy, Judy Hansen Shen(+8 more)

Abstract:In many real-world machine learning (ML) applications (e.g. detecting broken bones in x-ray images, detecting species in camera traps), in practice models need to perform well on specific deployments (e.g. a specific hospital, a specific national park) rather than the domain broadly. However, deployments often have imbalanced, unique data distributions. Discrepancy between the training distribution and the deployment distribution can lead to suboptimal performance, highlighting the need to select deployment-specialized subsets from the available training data. We formalize dataset subset selection for specialization (DS3): given a training set drawn from a general distribution and a (potentially unlabeled) query set drawn from the desired deployment-specific distribution, the goal is to select a subset of the training data that optimizes deployment performance. We introduce DataS^3; the first dataset and benchmark designed specifically for the DS3 problem. DataS^3 encompasses diverse real-world application domains, each with a set of distinct deployments to specialize in. We conduct a comprehensive study evaluating algorithms from various families--including coresets, data filtering, and data curation--on DataS^3, and find that general-distribution methods consistently fail on deployment-specific tasks. Additionally, we demonstrate the existence of manually curated (deployment-specific) expert subsets that outperform training on all available data with accuracy gains up to 51.3 percent. Our benchmark highlights the critical role of tailored dataset curation in enhancing performance and training efficiency on deployment-specific distributions, which we posit will only become more important as global, public datasets become available across domains and ML models are deployed in the real world.

Via

Access Paper or Ask Questions

SeaSplat: Representing Underwater Scenes with 3D Gaussian Splatting and a Physically Grounded Image Formation Model

Sep 25, 2024

Daniel Yang, John J. Leonard, Yogesh Girdhar

Abstract:We introduce SeaSplat, a method to enable real-time rendering of underwater scenes leveraging recent advances in 3D radiance fields. Underwater scenes are challenging visual environments, as rendering through a medium such as water introduces both range and color dependent effects on image capture. We constrain 3D Gaussian Splatting (3DGS), a recent advance in radiance fields enabling rapid training and real-time rendering of full 3D scenes, with a physically grounded underwater image formation model. Applying SeaSplat to the real-world scenes from SeaThru-NeRF dataset, a scene collected by an underwater vehicle in the US Virgin Islands, and simulation-degraded real-world scenes, not only do we see increased quantitative performance on rendering novel viewpoints from the scene with the medium present, but are also able to recover the underlying true color of the scene and restore renders to be without the presence of the intervening medium. We show that the underwater image formation helps learn scene structure, with better depth maps, as well as show that our improvements maintain the significant computational improvements afforded by leveraging a 3D Gaussian representation.

* Project page here: https://seasplat.github.io

Via

Access Paper or Ask Questions

ReefGlider: A highly maneuverable vectored buoyancy engine based underwater robot

May 09, 2024

Kevin Macauley, Levi Cai, Peter Adamczyk, Yogesh Girdhar

Abstract:There exists a capability gap in the design of currently available autonomous underwater vehicles (AUV). Most AUVs use a set of thrusters, and optionally control surfaces, to control their depth and pose. AUVs utilizing thrusters can be highly maneuverable, making them well-suited to operate in complex environments such as in close-proximity to coral reefs. However, they are inherently power-inefficient and produce significant noise and disturbance. Underwater gliders, on the other hand, use changes in buoyancy and center of mass, in combination with a control surface to move around. They are extremely power efficient but not very maneuverable. Gliders are designed for long-range missions that do not require precision maneuvering. Furthermore, since gliders only activate the buoyancy engine for small time intervals, they do not disturb the environment and can also be used for passive acoustic observations. In this paper we present ReefGlider, a novel AUV that uses only buoyancy for control but is still highly maneuverable from additional buoyancy control devices. ReefGlider bridges the gap between the capabilities of thruster-driven AUVs and gliders. These combined characteristics make ReefGlider ideal for tasks such as long-term visual and acoustic monitoring of coral reefs. We present the overall design and implementation of the system, as well as provide analysis of some of its capabilities.

* In IEEE International Conference on Robotics and Automation (ICRA), 2024

Via

Access Paper or Ask Questions

Biological Hotspot Mapping in Coral Reefs with Robotic Visual Surveys

May 03, 2023

Daniel Yang, Levi Cai, Stewart Jamieson, Yogesh Girdhar

Figure 1 for Biological Hotspot Mapping in Coral Reefs with Robotic Visual Surveys

Figure 2 for Biological Hotspot Mapping in Coral Reefs with Robotic Visual Surveys

Figure 3 for Biological Hotspot Mapping in Coral Reefs with Robotic Visual Surveys

Abstract:Coral reefs are fast-changing and complex ecosystems that are crucial to monitor and study. Biological hotspot detection can help coral reef managers prioritize limited resources for monitoring and intervention tasks. Here, we explore the use of autonomous underwater vehicles (AUVs) with cameras, coupled with visual detectors and photogrammetry, to map and identify these hotspots. This approach can provide high spatial resolution information in fast feedback cycles. To the best of our knowledge, we present one of the first attempts at using an AUV to gather visually-observed, fine-grain biological hotspot maps in concert with topography of a coral reefs. Our hotspot maps correlate with rugosity, an established proxy metric for coral reef biodiversity and abundance, as well as with our visual inspections of the 3D reconstruction. We also investigate issues of scaling this approach when applied to new reefs by using these visual detectors pre-trained on large public datasets.

Via

Access Paper or Ask Questions

DeepSeeColor: Realtime Adaptive Color Correction for Autonomous Underwater Vehicles via Deep Learning Methods

Mar 07, 2023

Stewart Jamieson, Jonathan P. How, Yogesh Girdhar

Abstract:Successful applications of complex vision-based behaviours underwater have lagged behind progress in terrestrial and aerial domains. This is largely due to the degraded image quality resulting from the physical phenomena involved in underwater image formation. Spectrally-selective light attenuation drains some colors from underwater images while backscattering adds others, making it challenging to perform vision-based tasks underwater. State-of-the-art methods for underwater color correction optimize the parameters of image formation models to restore the full spectrum of color to underwater imagery. However, these methods have high computational complexity that is unfavourable for realtime use by autonomous underwater vehicles (AUVs), as a result of having been primarily designed for offline color correction. Here, we present DeepSeeColor, a novel algorithm that combines a state-of-the-art underwater image formation model with the computational efficiency of deep learning frameworks. In our experiments, we show that DeepSeeColor offers comparable performance to the popular "Sea-Thru" algorithm (Akkaynak & Treibitz, 2019) while being able to rapidly process images at up to 60Hz, thus making it suitable for use onboard AUVs as a preprocessing step to enable more robust vision-based behaviours.

* 7 pages, 5 figures, 2 tables. Presented at the 2023 International Conference on Robotics and Automation (ICRA)

Via

Access Paper or Ask Questions

Semi-Supervised Visual Tracking of Marine Animals using Autonomous Underwater Vehicles

Feb 14, 2023

Levi Cai, Nathan E. McGuire, Roger Hanlon, T. Aran Mooney, Yogesh Girdhar

Abstract:In-situ visual observations of marine organisms is crucial to developing behavioural understandings and their relations to their surrounding ecosystem. Typically, these observations are collected via divers, tags, and remotely-operated or human-piloted vehicles. Recently, however, autonomous underwater vehicles equipped with cameras and embedded computers with GPU capabilities are being developed for a variety of applications, and in particular, can be used to supplement these existing data collection mechanisms where human operation or tags are more difficult. Existing approaches have focused on using fully-supervised tracking methods, but labelled data for many underwater species are severely lacking. Semi-supervised trackers may offer alternative tracking solutions because they require less data than fully-supervised counterparts. However, because there are not existing realistic underwater tracking datasets, the performance of semi-supervised tracking algorithms in the marine domain is not well understood. To better evaluate their performance and utility, in this paper we provide (1) a novel dataset specific to marine animals located at http://warp.whoi.edu/vmat/, (2) an evaluation of state-of-the-art semi-supervised algorithms in the context of underwater animal tracking, and (3) an evaluation of real-world performance through demonstrations using a semi-supervised algorithm on-board an autonomous underwater vehicle to track marine animals in the wild.

* To appear in IJCV SI: Animal Tracking

Via

Access Paper or Ask Questions

Multi-Robot Distributed Semantic Mapping in Unfamiliar Environments through Online Matching of Learned Representations

Mar 27, 2021

Stewart Jamieson, Kaveh Fathian, Kasra Khosoussi, Jonathan P. How, Yogesh Girdhar

Figure 1 for Multi-Robot Distributed Semantic Mapping in Unfamiliar Environments through Online Matching of Learned Representations

Figure 2 for Multi-Robot Distributed Semantic Mapping in Unfamiliar Environments through Online Matching of Learned Representations

Figure 3 for Multi-Robot Distributed Semantic Mapping in Unfamiliar Environments through Online Matching of Learned Representations

Figure 4 for Multi-Robot Distributed Semantic Mapping in Unfamiliar Environments through Online Matching of Learned Representations

Abstract:We present a solution to multi-robot distributed semantic mapping of novel and unfamiliar environments. Most state-of-the-art semantic mapping systems are based on supervised learning algorithms that cannot classify novel observations online. While unsupervised learning algorithms can invent labels for novel observations, approaches to detect when multiple robots have independently developed their own labels for the same new class are prone to erroneous or inconsistent matches. These issues worsen as the number of robots in the system increases and prevent fusing the local maps produced by each robot into a consistent global map, which is crucial for cooperative planning and joint mission summarization. Our proposed solution overcomes these obstacles by having each robot learn an unsupervised semantic scene model online and use a multiway matching algorithm to identify consistent sets of matches between learned semantic labels belonging to different robots. Compared to the state of the art, the proposed solution produces 20-60% higher quality global maps that do not degrade even as many more local maps are fused.

* 7 pages, 6 figures, 1 table; accepted for presentation in IEEE Int. Conf. on Robotics and Automation, ICRA '21, Xi'an, China, June 2021

Via

Access Paper or Ask Questions

Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor

Dec 14, 2020

Francois Robert Hogan, Michael Jenkin, Sahand Rezaei-Shoshtari, Yogesh Girdhar, David Meger, Gregory Dudek

Figure 1 for Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor

Figure 2 for Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor

Figure 3 for Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor

Figure 4 for Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor

Abstract:We introduce a new class of vision-based sensor and associated algorithmic processes that combine visual imaging with high-resolution tactile sending, all in a uniform hardware and computational architecture. We demonstrate the sensor's efficacy for both multi-modal object recognition and metrology. Object recognition is typically formulated as an unimodal task, but by combining two sensor modalities we show that we can achieve several significant performance improvements. This sensor, named the See-Through-your-Skin sensor (STS), is designed to provide rich multi-modal sensing of contact surfaces. Inspired by recent developments in optical tactile sensing technology, we address a key missing feature of these sensors: the ability to capture a visual perspective of the region beyond the contact surface. Whereas optical tactile sensors are typically opaque, we present a sensor with a semitransparent skin that has the dual capabilities of acting as a tactile sensor and/or as a visual camera depending on its internal lighting conditions. This paper details the design of the sensor, showcases its dual sensing capabilities, and presents a deep learning architecture that fuses vision and touch. We validate the ability of the sensor to classify household objects, recognize fine textures, and infer their physical properties both through numerical simulations and experiments with a smart countertop prototype.

* A version of this paper appears in WACV 2021

Via

Access Paper or Ask Questions

Gaussian-Dirichlet Random Fields for Inference over High Dimensional Categorical Observations

Mar 26, 2020

John E. San Soucie, Heidi M. Sosik, Yogesh Girdhar

Figure 1 for Gaussian-Dirichlet Random Fields for Inference over High Dimensional Categorical Observations

Figure 2 for Gaussian-Dirichlet Random Fields for Inference over High Dimensional Categorical Observations

Figure 3 for Gaussian-Dirichlet Random Fields for Inference over High Dimensional Categorical Observations

Figure 4 for Gaussian-Dirichlet Random Fields for Inference over High Dimensional Categorical Observations

Abstract:We propose a generative model for the spatio-temporal distribution of high dimensional categorical observations. These are commonly produced by robots equipped with an imaging sensor such as a camera, paired with an image classifier, potentially producing observations over thousands of categories. The proposed approach combines the use of Dirichlet distributions to model sparse co-occurrence relations between the observed categories using a latent variable, and Gaussian processes to model the latent variable's spatio-temporal distribution. Experiments in this paper show that the resulting model is able to efficiently and accurately approximate the temporal distribution of high dimensional categorical measurements such as taxonomic observations of microscopic organisms in the ocean, even in unobserved (held out) locations, far from other samples. This work's primary motivation is to enable deployment of informative path planning techniques over high dimensional categorical fields, which until now have been limited to scalar or low dimensional vector observations.

* 8 pages, 10 figures, accepted to proceedings of International Conference on Robotics and Automation (ICRA) 2020

Via

Access Paper or Ask Questions