Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Daniele Nardi

Sapienza Università di Roma, Rome, Italy

Self-Supervised Data Generation for Precision Agriculture: Blending Simulated Environments with Real Imagery

Feb 25, 2025

Leonardo Saraceni, Ionut Marian Motoi, Daniele Nardi, Thomas Alessandro Ciarfuglia

Abstract:In precision agriculture, the scarcity of labeled data and significant covariate shifts pose unique challenges for training machine learning models. This scarcity is particularly problematic due to the dynamic nature of the environment and the evolving appearance of agricultural subjects as living things. We propose a novel system for generating realistic synthetic data to address these challenges. Utilizing a vineyard simulator based on the Unity engine, our system employs a cut-and-paste technique with geometrical consistency considerations to produce accurate photo-realistic images and labels from synthetic environments to train detection algorithms. This approach generates diverse data samples across various viewpoints and lighting conditions. We demonstrate considerable performance improvements in training a state-of-the-art detector by applying our method to table grapes cultivation. The combination of techniques can be easily automated, an increasingly important consideration for adoption in agricultural practice.

* Presented at 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE)

Via

Access Paper or Ask Questions

Can Robots "Taste" Grapes? Estimating SSC with Simple RGB Sensors

Dec 29, 2024

Thomas Alessandro Ciarfuglia, Ionut Marian Motoi, Leonardo Saraceni, Daniele Nardi

Abstract:In table grape cultivation, harvesting depends on accurately assessing fruit quality. While some characteristics, like color, are visible, others, such as Soluble Solid Content (SSC), or sugar content measured in degrees Brix ({\deg}Brix), require specific tools. SSC is a key quality factor that correlates with ripeness, but lacks a direct causal relationship with color. Hyperspectral cameras can estimate SSC with high accuracy under controlled laboratory conditions, but their practicality in field environments is limited. This study investigates the potential of simple RGB sensors under uncontrolled lighting to estimate SSC and color, enabling cost-effective, robot-assisted harvesting. Over the 2021 and 2022 summer seasons, we collected grape images with corresponding SSC and color labels to evaluate algorithmic solutions for SSC estimation on embedded devices commonly used in robotics and smartphones. Our results demonstrate that SSC can be estimated from visual appearance with human-like performance. We propose computationally efficient histogram-based methods for resource-constrained robots and deep learning approaches for more complex applications.

Via

Access Paper or Ask Questions

Synthetic Data Generation for Anomaly Detection on Table Grapes

Dec 17, 2024

Ionut Marian Motoi, Valerio Belli, Alberto Carpineto, Daniele Nardi, Thomas Alessandro Ciarfuglia

Abstract:Early detection of illnesses and pest infestations in fruit cultivation is critical for maintaining yield quality and plant health. Computer vision and robotics are increasingly employed for the automatic detection of such issues, particularly using data-driven solutions. However, the rarity of these problems makes acquiring and processing the necessary data to train such algorithms a significant obstacle. One solution to this scarcity is the generation of synthetic high-quality anomalous samples. While numerous methods exist for this task, most require highly trained individuals for setup. This work addresses the challenge of generating synthetic anomalies in an automatic fashion that requires only an initial collection of normal and anomalous samples from the user - a task that is straightforward for farmers. We demonstrate the approach in the context of table grape cultivation. Specifically, based on the observation that normal berries present relatively smooth surfaces, while defects result in more complex textures, we introduce a Dual-Canny Edge Detection (DCED) filter. This filter emphasizes the additional texture indicative of diseases, pest infestations, or other defects. Using segmentation masks provided by the Segment Anything Model, we then select and seamlessly blend anomalous berries onto normal ones. We show that the proposed dataset augmentation technique improves the accuracy of an anomaly classifier for table grapes and that the approach can be generalized to other fruit types.

Via

Access Paper or Ask Questions

Real-Time Multimodal Signal Processing for HRI in RoboCup: Understanding a Human Referee

Nov 26, 2024

Filippo Ansalone, Flavio Maiorana, Daniele Affinita, Flavio Volpi, Eugenio Bugli, Francesco Petri, Michele Brienza, Valerio Spagnoli, Vincenzo Suriani, Daniele Nardi(+1 more)

Figure 1 for Real-Time Multimodal Signal Processing for HRI in RoboCup: Understanding a Human Referee

Figure 2 for Real-Time Multimodal Signal Processing for HRI in RoboCup: Understanding a Human Referee

Figure 3 for Real-Time Multimodal Signal Processing for HRI in RoboCup: Understanding a Human Referee

Abstract:Advancing human-robot communication is crucial for autonomous systems operating in dynamic environments, where accurate real-time interpretation of human signals is essential. RoboCup provides a compelling scenario for testing these capabilities, requiring robots to understand referee gestures and whistle with minimal network reliance. Using the NAO robot platform, this study implements a two-stage pipeline for gesture recognition through keypoint extraction and classification, alongside continuous convolutional neural networks (CCNNs) for efficient whistle detection. The proposed approach enhances real-time human-robot interaction in a competitive setting like RoboCup, offering some tools to advance the development of autonomous systems capable of cooperating with humans.

* 11th Italian Workshop on Artificial Intelligence and Robotics (AIRO 2024), Published in CEUR Workshop Proceedings AI*IA Series

Via

Access Paper or Ask Questions

EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution

Aug 30, 2024

Francesco Argenziano, Michele Brienza, Vincenzo Suriani, Daniele Nardi, Domenico D. Bloisi

Abstract:Task planning for robots in real-life settings presents significant challenges. These challenges stem from three primary issues: the difficulty in identifying grounded sequences of steps to achieve a goal; the lack of a standardized mapping between high-level actions and low-level commands; and the challenge of maintaining low computational overhead given the limited resources of robotic hardware. We introduce EMPOWER, a framework designed for open-vocabulary online grounding and planning for embodied agents aimed at addressing these issues. By leveraging efficient pre-trained foundation models and a multi-role mechanism, EMPOWER demonstrates notable improvements in grounded planning and execution. Quantitative results highlight the effectiveness of our approach, achieving an average success rate of 0.73 across six different real-life scenarios using a TIAGo robot.

* Accepted at IROS 2024

Via

Access Paper or Ask Questions

Multi-agent Planning using Visual Language Models

Aug 10, 2024

Michele Brienza, Francesco Argenziano, Vincenzo Suriani, Domenico D. Bloisi, Daniele Nardi

Abstract:Large Language Models (LLMs) and Visual Language Models (VLMs) are attracting increasing interest due to their improving performance and applications across various domains and tasks. However, LLMs and VLMs can produce erroneous results, especially when a deep understanding of the problem domain is required. For instance, when planning and perception are needed simultaneously, these models often struggle because of difficulties in merging multi-modal information. To address this issue, fine-tuned models are typically employed and trained on specialized data structures representing the environment. This approach has limited effectiveness, as it can overly complicate the context for processing. In this paper, we propose a multi-agent architecture for embodied task planning that operates without the need for specific data structures as input. Instead, it uses a single image of the environment, handling free-form domains by leveraging commonsense knowledge. We also introduce a novel, fully automatic evaluation procedure, PG2S, designed to better assess the quality of a plan. We validated our approach using the widely recognized ALFRED dataset, comparing PG2S to the existing KAS metric to further evaluate the quality of the generated plans.

Via

Access Paper or Ask Questions

Shape and Style GAN-based Multispectral Data Augmentation for Crop/Weed Segmentation in Precision Farming

Jul 19, 2024

Mulham Fawakherji, Vincenzo Suriani, Daniele Nardi, Domenico Daniele Bloisi

Abstract:The use of deep learning methods for precision farming is gaining increasing interest. However, collecting training data in this application field is particularly challenging and costly due to the need of acquiring information during the different growing stages of the cultivation of interest. In this paper, we present a method for data augmentation that uses two GANs to create artificial images to augment the training data. To obtain a higher image quality, instead of re-creating the entire scene, we take original images and replace only the patches containing objects of interest with artificial ones containing new objects with different shapes and styles. In doing this, we take into account both the foreground (i.e., crop samples) and the background (i.e., the soil) of the patches. Quantitative experiments, conducted on publicly available datasets, demonstrate the effectiveness of the proposed approach. The source code and data discussed in this work are available as open source.

* Crop Protection, Volume 184, October 2024, 106848

Via

Access Paper or Ask Questions

LLCoach: Generating Robot Soccer Plans using Multi-Role Large Language Models

Jun 26, 2024

Michele Brienza, Emanuele Musumeci, Vincenzo Suriani, Daniele Affinita, Andrea Pennisi, Daniele Nardi, Domenico Daniele Bloisi

Abstract:The deployment of robots into human scenarios necessitates advanced planning strategies, particularly when we ask robots to operate in dynamic, unstructured environments. RoboCup offers the chance to deploy robots in one of those scenarios, a human-shaped game represented by a soccer match. In such scenarios, robots must operate using predefined behaviors that can fail in unpredictable conditions. This paper introduces a novel application of Large Language Models (LLMs) to address the challenge of generating actionable plans in such settings, specifically within the context of the RoboCup Standard Platform League (SPL) competitions where robots are required to autonomously execute soccer strategies that emerge from the interactions of individual agents. In particular, we propose a multi-role approach leveraging the capabilities of LLMs to generate and refine plans for a robotic soccer team. The potential of the proposed method is demonstrated through an experimental evaluation,carried out simulating multiple matches where robots with AI-generated plans play against robots running human-built code.

* Accepted at 27th RoboCup International Symposium will be held on 22 July 2024, in Eindhoven, The Netherlands

Via

Access Paper or Ask Questions

Play Everywhere: A Temporal Logic based Game Environment Independent Approach for Playing Soccer with Robots

May 21, 2024

Vincenzo Suriani, Emanuele Musumeci, Daniele Nardi, Domenico Daniele Bloisi

Abstract:Robots playing soccer often rely on hard-coded behaviors that struggle to generalize when the game environment change. In this paper, we propose a temporal logic based approach that allows robots' behaviors and goals to adapt to the semantics of the environment. In particular, we present a hierarchical representation of soccer in which the robot selects the level of operation based on the perceived semantic characteristics of the environment, thus modifying dynamically the set of rules and goals to apply. The proposed approach enables the robot to operate in unstructured environments, just as it happens when humans go from soccer played on an official field to soccer played on a street. Three different use cases set in different scenarios are presented to demonstrate the effectiveness of the proposed approach.

* Lecture Notes in Computer Science ((LNAI,volume 14140)) Included in the following conference series: Robot World Cup RoboCup 2023: Robot World Cup XXVI
* RoboCup 2023: Robot World Cup XXVI Best Paper

Via

Access Paper or Ask Questions

Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery

Apr 08, 2024

Ionut M. Motoi, Leonardo Saraceni, Daniele Nardi, Thomas A. Ciarfuglia

Abstract:Satellite imagery is crucial for tasks like environmental monitoring and urban planning. Typically, it relies on semantic segmentation or Land Use Land Cover (LULC) classification to categorize each pixel. Despite the advancements brought about by Deep Neural Networks (DNNs), their performance in segmentation tasks is hindered by challenges such as limited availability of labeled data, class imbalance and the inherent variability and complexity of satellite images. In order to mitigate those issues, our study explores the effectiveness of a Cut-and-Paste augmentation technique for semantic segmentation in satellite images. We adapt this augmentation, which usually requires labeled instances, to the case of semantic segmentation. By leveraging the connected components in the semantic segmentation labels, we extract instances that are then randomly pasted during training. Using the DynamicEarthNet dataset and a U-Net model for evaluation, we found that this augmentation significantly enhances the mIoU score on the test set from 37.9 to 44.1. This finding highlights the potential of the Cut-and-Paste augmentation to improve the generalization capabilities of semantic segmentation models in satellite imagery.

* Accepted for publication in IEEE 2024 International Geoscience & Remote Sensing Symposium (IGARSS 2024)

Via

Access Paper or Ask Questions