Abstract:Large-scale orchard production requires timely and precise disease monitoring, yet routine manual scouting is labor-intensive and financially impractical at the scale of modern operations. As a result, disease outbreaks are often detected late and tracked at coarse spatial resolutions, typically at the orchard-block level. We present an autonomous mobile active perception system for targeted disease detection and mapping in dormant apple trees, demonstrated on fire blight, one of the most devastating diseases affecting apple today. The system integrates flash-illuminated stereo RGB sensing, real-time depth estimation, instance-level segmentation, and confidence-aware semantic 3D mapping to achieve precise localization of disease symptoms. Semantic predictions are fused into a volumetric occupancy map that tracks both occupancy and per-voxel semantic confidence, building actionable spatial maps for growers. To actively refine observations within complex canopies, we evaluate three viewpoint planning strategies within a unified perception-action loop: a deterministic geometric baseline, a volumetric next-best-view planner that maximizes unknown-space reduction, and a semantic next-best-view planner that prioritizes low-confidence symptomatic regions. Experiments on a fabricated lab tree and five simulated symptomatic trees demonstrate reliable symptom localization and mapping as a precursor to a field evaluation. In simulation, the semantic planner achieves the highest F1 score (0.6106) after 30 viewpoints, while the volumetric planner achieves the highest ROI coverage (85.82%). In the lab setting, the semantic planner attains the highest final F1 (0.9058), with both next-best-view planners substantially improving coverage over the baseline.
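The per-voxel semantic confidence described in the abstract above could be tracked in many ways; the abstract does not say which. The sketch below assumes a log-odds accumulation scheme, a common choice for occupancy maps. The class `SemanticVoxelMap`, its parameters, and the thresholds are hypothetical names chosen for illustration, not the authors' implementation.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def sigmoid(l):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-l))

class SemanticVoxelMap:
    """Hypothetical store of per-voxel symptom confidence in log-odds form."""

    def __init__(self, voxel_size=0.01, clamp=6.0):
        self.voxel_size = voxel_size  # assumed voxel edge length in metres
        self.clamp = clamp            # bound on log-odds so voxels stay updatable
        self.log_odds = {}            # (i, j, k) -> log-odds of "symptomatic"

    def key(self, point):
        """Map a 3D point to the integer index of the voxel containing it."""
        return tuple(math.floor(c / self.voxel_size) for c in point)

    def fuse(self, point, confidence):
        """Add one segmentation confidence for the voxel containing `point`."""
        confidence = min(max(confidence, 1e-3), 1.0 - 1e-3)
        k = self.key(point)
        updated = self.log_odds.get(k, 0.0) + logit(confidence)
        self.log_odds[k] = max(-self.clamp, min(self.clamp, updated))

    def probability(self, point):
        """Fused symptom probability; 0.5 for voxels never observed."""
        return sigmoid(self.log_odds.get(self.key(point), 0.0))

    def uncertain_voxels(self, low=0.5, high=0.9):
        """Voxels leaning symptomatic but not yet confidently so."""
        return [k for k, l in self.log_odds.items() if low <= sigmoid(l) < high]
```

Under these assumptions, a semantic next-best-view planner could score a candidate viewpoint by how many entries of `uncertain_voxels()` fall inside its field of view, so that repeated observations push those voxels above the confidence threshold.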
Abstract:Monitoring and controlling invasive tree species across large forests, parks, and trail networks is challenging due to limited accessibility, reliance on manual scouting, and degraded under-canopy GNSS. We present MapForest, a modular field robotics system that transforms multi-modal sensor data into GIS-ready invasive-species maps. Our system features: (i) a compact, platform-agnostic sensing payload that can be rapidly mounted on UAV, bicycle, or backpack platforms, and (ii) a software pipeline comprising LiDAR-inertial mapping, image-based invasive-species detection, and georeferenced map generation. To ensure reliable operation in GNSS-intermittent environments, we enhance a LiDAR-inertial mapping backbone with covariance-aware GNSS factors and robust loss kernels. We train an object detector to detect the Tree-of-Heaven (Ailanthus altissima) from onboard RGB imagery and fuse detections with the reconstructed map to produce geospatial outputs suitable for downstream decision making. We collected a dataset spanning six sites across urban environments, parks, trails, and forests to evaluate individual system modules, and report end-to-end results on two sites containing Tree-of-Heaven. The enhanced mapping module achieved a trajectory deviation error of 1.95 m over a 1.2 km forest traversal, and the Tree-of-Heaven detector achieved an F1 score of 0.653. The datasets and associated tooling are released to support reproducible research in forest mapping and invasive-species monitoring.
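To illustrate what a covariance-aware GNSS factor with a robust loss kernel might look like, the sketch below whitens a position residual by the reported GNSS standard deviations and applies a Huber loss. The diagonal covariance, the choice of Huber, the threshold `delta`, and all function names are assumptions for illustration; the abstract does not specify the kernel or the factor formulation.

```python
import math

def whitened_residual(predicted, measured, sigmas):
    """Position residual scaled by per-axis GNSS standard deviations.

    Assumes a diagonal covariance, so each axis is divided by its own sigma.
    """
    return [(p - m) / s for p, m, s in zip(predicted, measured, sigmas)]

def huber_weight(norm, delta=1.345):
    """Huber reweighting: 1 inside the threshold, delta / norm outside."""
    return 1.0 if norm <= delta else delta / norm

def gnss_factor_cost(predicted, measured, sigmas, delta=1.345):
    """Huber cost of one GNSS position factor.

    Quadratic for small whitened residuals, linear for large ones, so a
    single bad fix under canopy cannot dominate the optimization.
    """
    r = whitened_residual(predicted, measured, sigmas)
    norm = math.sqrt(sum(x * x for x in r))
    if norm <= delta:
        return 0.5 * norm * norm
    return delta * (norm - 0.5 * delta)
```

With this formulation, the same 1 m disagreement costs far less when the receiver reports a 5 m standard deviation than when it reports 0.5 m, which is the intended effect of weighting GNSS by its covariance in intermittent-reception environments.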
Abstract:Agricultural robotics has emerged as a critical solution to the labor shortages and rising costs associated with manual crop harvesting. Bell pepper harvesting, in particular, is a labor-intensive task, accounting for up to 50% of total production costs. While automated solutions have shown promise in controlled greenhouse environments, harvesting in unstructured outdoor farms remains an open challenge due to environmental variability and occlusion. This paper presents VADER (Vision-guided Autonomous Dual-arm Extraction Robot), a dual-arm mobile manipulation system designed specifically for the autonomous harvesting of bell peppers in outdoor environments. The system integrates a robust perception pipeline coupled with a dual-arm planning framework that coordinates a gripping arm and a cutting arm for extraction. We validate the system through trials in various realistic conditions, demonstrating a harvest success rate exceeding 60% with a cycle time of under 100 seconds per fruit, while also featuring a teleoperation fail-safe based on the GELLO teleoperation framework to ensure robustness. To support robust perception, we contribute a hierarchically structured dataset of over 3,200 images spanning indoor and outdoor domains, pairing wide-field scene images with close-up pepper images to enable a coarse-to-fine training strategy from fruit detection to high-precision pose estimation. The code and dataset will be made publicly available upon acceptance.




Abstract:This paper presents a pipeline that combines high-resolution orthomosaic maps generated from UAS imagery with GPS-based global navigation to guide a skid-steered ground robot. We evaluated three path planning strategies: A* graph search, a Deep Q-learning (DQN) model, and heuristic search, benchmarking them on planning time and success rate in realistic simulation environments. Experimental results reveal that the heuristic search achieves the fastest planning times (0.28 ms) and a 100% success rate, while the A* approach delivers near-optimal performance, and the DQN model, despite its adaptability, incurs longer planning delays and occasional suboptimal routing. These results highlight the advantages of deterministic rule-based methods in geometrically constrained crop-row environments and lay the groundwork for future hybrid strategies in precision agriculture.
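As a rough illustration of the A* graph search strategy named in the abstract above, the sketch below plans on a small occupancy grid in which crop rows are blocked cells. The grid encoding, 4-connectivity, unit step cost, and Manhattan heuristic are assumptions for illustration; the abstract does not describe how the orthomosaic is converted to a graph.

```python
import heapq

def astar(grid, start, goal):
    """Shortest 4-connected path on a grid; cells equal to 1 are blocked.

    Returns a list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk parent links back to the start, then reverse.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost.get(cell, float("inf")):
            continue  # stale heap entry superseded by a cheaper one
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None

# Two crop rows (columns of 1s) with headland space above and below.
field = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
route = astar(field, (2, 0), (2, 2))
```

In this toy field the robot cannot cross a crop row, so `route` leaves its inter-row lane through a headland and re-enters the neighbouring lane, which mirrors the geometric constraint that makes rule-based planners competitive in row crops.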




Abstract:Sim2Real transfer, particularly for manipulation policies relying on RGB images, remains a critical challenge in robotics due to the significant domain shift between synthetic and real-world visual data. In this paper, we propose SplatSim, a novel framework that leverages Gaussian Splatting as the primary rendering primitive to reduce the Sim2Real gap for RGB-based manipulation policies. By replacing traditional mesh representations with Gaussian Splats in simulators, SplatSim produces highly photorealistic synthetic data while maintaining the scalability and cost-efficiency of simulation. We demonstrate the effectiveness of our framework by training manipulation policies within SplatSim and deploying them in the real world in a zero-shot manner, achieving an average success rate of 86.25%, compared to 97.5% for policies trained on real-world data.




Abstract:Autonomous navigation is crucial for various robotics applications in agriculture. However, many existing methods depend on RTK-GPS systems, which are expensive and susceptible to poor signal coverage. This paper introduces a state-of-the-art LiDAR-based navigation system that can achieve over-canopy autonomous navigation in row-crop fields, even when the canopy fully blocks the interrow spacing. Our crop row detection algorithm can detect crop rows across diverse scenarios, encompassing various crop types, growth stages, weed presence, and discontinuities within the crop rows. Without utilizing the global localization of the robot, our navigation system can perform autonomous navigation in these challenging scenarios, detect the end of the crop rows, and navigate to the next crop row autonomously, providing a crop-agnostic approach to navigate the whole row-crop field. This navigation system has undergone tests in various simulated agricultural fields, achieving an average autonomous driving accuracy of 2.98 cm without human intervention on the custom Amiga robot. In addition, the qualitative results of our crop row detection algorithm from actual soybean fields validate our LiDAR-based crop row detection algorithm's potential for practical agricultural applications.




Abstract:Dormant-season grapevine pruning requires skilled seasonal workers during the winter, and these workers are becoming less available. As workers hasten to prune more vines in less time amid the short-term seasonal hiring culture and low wages, vines are often pruned inconsistently, leading to imbalanced grapevines. In addition, existing mechanical methods cannot selectively prune grapevines, and manual follow-up operations are often required, which further increases production cost. In this paper, we present the design and field evaluation of a rugged and fully autonomous robot for end-to-end pruning of dormant-season grapevines. The proposed design incorporates novel camera systems, a kinematically redundant manipulator, a ground robot, and novel algorithms in the perception system. The presented research prototype robot system was able to spur prune a row of vines from both sides completely in 213 sec/vine with a total pruning accuracy of 87%. Initial field tests of the autonomous system in a commercial vineyard have shown significant variability reduction in dormant-season pruning when compared to mechanical pre-pruning trials. The design approach, system components, lessons learned, and future enhancements, as well as a brief economic analysis, are described in the manuscript.




Abstract:Object detection and semantic segmentation are two of the most widely adopted deep learning algorithms in agricultural applications. One of the major sources of variability in the quality of images acquired outdoors for such tasks is changing lighting conditions, which can alter the appearance of the objects or the contents of the entire image. While transfer learning and data augmentation to some extent reduce the need for large amounts of data to train deep neural networks, the large variety of cultivars and the lack of shared datasets in agriculture make wide-scale field deployments difficult. In this paper, we present a high-throughput, robust, active lighting-based camera system that generates consistent images in all lighting conditions. We detail experiments showing that consistent image quality reduces the number of images needed to train deep neural networks for the task of object detection. We further present results from field experiments under extreme lighting conditions, where images captured without active lighting fail to provide consistent results. The experimental results show that, on average, deep nets for object detection trained on consistent data required nearly four times less data to achieve a similar level of accuracy. This proposed work could potentially provide pragmatic solutions to computer vision needs in agriculture.