Vittorio Giammarino

Beyond Domain Randomization: Event-Inspired Perception for Visually Robust Adversarial Imitation from Videos

May 24, 2025

Andrea Ramazzina, Vittorio Giammarino, Matteo El-Hariry, Mario Bijelic

Abstract: Imitation from videos often fails when expert demonstrations and learner environments exhibit domain shifts, such as discrepancies in lighting, color, or texture. While visual randomization partially addresses this problem by augmenting training data, it remains computationally intensive and inherently reactive, struggling with unseen scenarios. We propose a different approach: instead of randomizing appearances, we eliminate their influence entirely by rethinking the sensory representation itself. Inspired by biological vision systems that prioritize temporal transients (e.g., retinal ganglion cells) and by recent sensor advancements, we introduce event-inspired perception for visually robust imitation. Our method converts standard RGB videos into a sparse, event-based representation that encodes temporal intensity gradients, discarding static appearance features. This biologically grounded approach disentangles motion dynamics from visual style, enabling robust visual imitation from observations even in the presence of visual mismatches between expert and agent environments. By training policies on event streams, we achieve invariance to appearance-based distractors without requiring computationally expensive and environment-specific data augmentation techniques. Experiments across the DeepMind Control Suite and the Adroit platform for dynamic dexterous manipulation show the efficacy of our method. Our code is publicly available at Eb-LAIfO.
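The conversion from RGB video to an event-like representation can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `frames_to_events`, the log-intensity differencing, and the `threshold` and `eps` parameters are assumptions chosen to mimic how event cameras are commonly modeled, where a pixel fires only when its brightness changes enough between consecutive frames.

```python
import numpy as np

def frames_to_events(frames, threshold=0.1, eps=1e-3):
    """Convert an RGB clip into signed per-pixel events (hypothetical helper).

    frames: uint8 array of shape (T, H, W, 3).
    Returns an int8 array of shape (T-1, H, W) with values in {-1, 0, +1}.
    """
    # Collapse color to a single intensity channel in [0, 1].
    gray = frames.astype(np.float64).mean(axis=-1) / 255.0
    # Work in log-intensity so that a change is measured relative to brightness.
    log_intensity = np.log(gray + eps)
    # Temporal gradient between consecutive frames.
    diff = np.diff(log_intensity, axis=0)
    # Keep only changes above the threshold; static pixels produce no event.
    events = np.zeros(diff.shape, dtype=np.int8)
    events[diff > threshold] = 1
    events[diff < -threshold] = -1
    return events
```

In this sketch a clip whose pixels never change produces an all-zero event stream whatever its color or texture, which is the intuition behind discarding static appearance features.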

Provably Efficient Off-Policy Adversarial Imitation Learning with Convergence Guarantees

May 26, 2024

Yilei Chen, Vittorio Giammarino, James Queeney, Ioannis Ch. Paschalidis

Abstract: Adversarial Imitation Learning (AIL) faces challenges with sample inefficiency because of its reliance on sufficient on-policy data to evaluate the performance of the current policy during reward function updates. In this work, we study the convergence properties and sample complexity of off-policy AIL algorithms. We show that, even in the absence of importance sampling correction, reusing samples generated by the $o(\sqrt{K})$ most recent policies, where $K$ is the number of iterations of policy updates and reward updates, does not undermine the convergence guarantees of this class of algorithms. Furthermore, our results indicate that the distribution shift error induced by off-policy updates is dominated by the benefits of having more data available. This result provides theoretical support for the sample efficiency of off-policy AIL algorithms. To the best of our knowledge, this is the first work that provides theoretical guarantees for off-policy AIL algorithms.
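One way to picture the sample-reuse scheme is a buffer that keeps data from only a sliding window of recent policies. The sketch below is illustrative and not taken from the paper: the class `RecentPolicyBuffer`, the helper `window_size`, and the choice of a window of `floor(K ** alpha)` policies with `alpha < 0.5` (one example of a quantity that grows as $o(\sqrt{K})$) are all assumptions.

```python
import math
from collections import deque

def window_size(K, alpha=0.4):
    """Hypothetical window rule: floor(K**alpha) with alpha < 0.5 is o(sqrt(K))."""
    if not 0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 0.5) for an o(sqrt(K)) window")
    return max(1, math.floor(K ** alpha))

class RecentPolicyBuffer:
    """Stores transitions from the most recent `num_policies` policies only."""

    def __init__(self, num_policies):
        # A bounded deque silently evicts the oldest policy's batch when full.
        self._batches = deque(maxlen=num_policies)

    def add_policy_batch(self, transitions):
        # One batch per policy iteration.
        self._batches.append(list(transitions))

    def all_transitions(self):
        # Samples used for the reward update, with no importance weights.
        return [t for batch in self._batches for t in batch]
```

With `K = 10000` iterations and `alpha = 0.4`, this rule would retain data from the 39 most recent policies.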

Reinforcement Learning-based Receding Horizon Control using Adaptive Control Barrier Functions for Safety-Critical Systems

Mar 26, 2024

Ehsan Sabouni, H. M. Sabbir Ahmad, Vittorio Giammarino, Christos G. Cassandras, Ioannis Ch. Paschalidis, Wenchao Li

Abstract: Optimal control methods provide solutions to safety-critical problems but easily become intractable. Control Barrier Functions (CBFs) have emerged as a popular technique that facilitates their solution by provably guaranteeing safety, through their forward invariance property, at the expense of some performance loss. This approach involves defining a performance objective alongside CBF-based safety constraints that must always be enforced. Unfortunately, both performance and solution feasibility can be significantly impacted by two key factors: (i) the selection of the cost function and associated parameters, and (ii) the calibration of parameters within the CBF-based constraints, which capture the trade-off between performance and conservativeness. To address these challenges, we propose a Reinforcement Learning (RL)-based Receding Horizon Control (RHC) approach leveraging Model Predictive Control (MPC) with CBFs (MPC-CBF). In particular, we parameterize our controller and use bilevel optimization, where RL is used to learn the optimal parameters while MPC computes the optimal control input. We validate our method by applying it to the challenging automated merging control problem for Connected and Automated Vehicles (CAVs) at conflicting roadways. Results demonstrate improved performance and a significant reduction in the number of infeasible cases compared to traditional heuristic approaches used for tuning CBF-based controllers, showcasing the effectiveness of the proposed method.
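For readers unfamiliar with CBFs, the standard textbook form of the constraint is sketched below. The notation is an assumption, not taken from the paper: a control-affine system $\dot{x} = f(x) + g(x)u$, a safe set $\mathcal{C} = \{x : h(x) \ge 0\}$, and a class-$\mathcal{K}$ function $\alpha$. In the common linear case $\alpha(h) = \gamma h$, the scalar $\gamma$ is an example of the kind of tunable parameter that trades performance against conservativeness.

```latex
% Safe set defined by a continuously differentiable function h
\mathcal{C} = \{\, x \in \mathbb{R}^n : h(x) \ge 0 \,\}

% CBF constraint enforced on the control input u at every state x
L_f h(x) + L_g h(x)\, u + \alpha\big(h(x)\big) \ge 0

% Common linear choice of the class-K function, with tunable gamma > 0
\alpha\big(h(x)\big) = \gamma\, h(x)
```

Any control input satisfying this inequality along the trajectory keeps $\mathcal{C}$ forward invariant: a state that starts in the safe set stays in it.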

A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert Observations

Feb 29, 2024

Erhan Can Ozcan, Vittorio Giammarino, James Queeney, Ioannis Ch. Paschalidis

Abstract: This paper investigates how to incorporate expert observations (without explicit information on expert actions) into a deep reinforcement learning setting to improve sample efficiency. First, we formulate an augmented policy loss combining a maximum entropy reinforcement learning objective with a behavioral cloning loss that leverages a forward dynamics model. Then, we propose an algorithm that automatically adjusts the weights of each component in the augmented loss function. Experiments on a variety of continuous control tasks demonstrate that the proposed algorithm outperforms various benchmarks by effectively utilizing available expert observations.

Adversarial Imitation Learning from Visual Observations using Latent Information

Sep 29, 2023

Vittorio Giammarino, James Queeney, Ioannis Ch. Paschalidis

Abstract: We focus on the problem of imitation learning from visual observations, where the learning agent has access to videos of experts as its sole learning source. The challenges of this framework include the absence of expert actions and the partial observability of the environment, as the ground-truth states can only be inferred from pixels. To tackle this problem, we first conduct a theoretical analysis of imitation learning in partially observable environments. We establish upper bounds on the suboptimality of the learning agent with respect to the divergence between the expert and the agent latent state-transition distributions. Motivated by this analysis, we introduce an algorithm called Latent Adversarial Imitation from Observations, which combines off-policy adversarial imitation techniques with a learned latent representation of the agent's state from sequences of observations. In experiments on high-dimensional continuous robotic tasks, we show that our algorithm matches state-of-the-art performance while providing significant computational advantages. Additionally, we show how our method can be used to improve the efficiency of reinforcement learning from pixels by leveraging expert videos. To ensure reproducibility, we provide free access to our code.

A Reinforcement Learning Approach for Robotic Unloading from Visual Observations

Sep 12, 2023

Vittorio Giammarino, Alberto Giammarino, Matthew Pearce

Abstract: In this work, we focus on a robotic unloading problem from visual observations, where robots are required to autonomously unload stacks of parcels using RGB-D images as their primary input source. While supervised and imitation learning have achieved good results in these types of tasks, they rely heavily on labeled data, which are challenging to obtain in realistic scenarios. Our study aims to develop a sample-efficient controller framework that can learn unloading tasks without the need for labeled data during the learning process. To tackle this challenge, we propose a hierarchical controller structure that combines a high-level decision-making module with classical motion control. The high-level module is trained using Deep Reinforcement Learning (DRL), wherein we incorporate a safety bias mechanism and design a reward function tailored to this task. Our experiments demonstrate that both these elements play a crucial role in achieving improved learning performance. Furthermore, to ensure reproducibility and establish a benchmark for future research, we provide free access to our code and simulation.

On the Opportunities and Challenges of using Animals Videos in Reinforcement Learning

Sep 27, 2022

Vittorio Giammarino

Abstract: We investigate the use of animal videos to improve efficiency and performance in Reinforcement Learning (RL). From a theoretical perspective, we motivate the use of weighted policy optimization for off-policy RL, describe the main challenges when learning from videos, and propose solutions. We test our ideas in offline and online RL and show encouraging results on a series of 2D navigation tasks.

Unsupervised Reward Shaping for a Robotic Sequential Picking Task from Visual Observations in a Logistics Scenario

Sep 25, 2022

Vittorio Giammarino

Abstract: We focus on an unloading problem, typical of the logistics sector, modeled as a sequential pick-and-place task. In this type of task, modern machine learning techniques have been shown to work better than classic systems, since they are more adaptable to stochasticity and better able to cope with large uncertainties. More specifically, supervised and imitation learning have achieved outstanding results in this regard, with the shortcoming of requiring some form of supervision, which is not obtainable in all settings. On the other hand, reinforcement learning (RL) requires a much milder form of supervision but remains impractical due to its inefficiency. In this paper, we propose and theoretically motivate a novel Unsupervised Reward Shaping algorithm from expert observations, which relaxes the level of supervision required by the agent and improves RL performance in our task.

Learning from humans: combining imitation and deep reinforcement learning to accomplish human-level performance on a virtual foraging task

Mar 30, 2022

Vittorio Giammarino, Matthew F Dunne, Kylie N Moore, Michael E Hasselmo, Chantal E Stern, Ioannis Ch. Paschalidis

Abstract: We develop a method to learn bio-inspired foraging policies using human data. We conduct an experiment where humans are virtually immersed in an open field foraging environment and are trained to collect the highest amount of rewards. A Markov Decision Process (MDP) framework is introduced to model the human decision dynamics. Then, Imitation Learning (IL) based on maximum likelihood estimation is used to train Neural Networks (NN) that map human decisions to observed states. The results show that passive imitation substantially underperforms humans. We further refine the human-inspired policies via Reinforcement Learning (RL), using on-policy algorithms that are more suitable for learning from pre-trained networks. We show that the combination of IL and RL can match human results and that good performance strongly depends on an egocentric representation of the environment. The developed methodology can be used to efficiently learn policies for unmanned vehicles that have to solve missions in an open field environment.
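Imitation learning by maximum likelihood estimation amounts to minimizing the negative log-likelihood of the demonstrated actions under the policy. The sketch below shows that objective for a discrete action space with a softmax policy. It is a generic illustration, not the paper's code: the function name `imitation_nll` and the assumption that the policy outputs one logit per action are hypothetical.

```python
import numpy as np

def imitation_nll(logits, actions):
    """Mean negative log-likelihood of demonstrated actions (hypothetical helper).

    logits:  float array (N, A), policy outputs for N observed states.
    actions: int array (N,), the action the demonstrator took in each state.
    """
    # Subtract the row maximum for numerical stability; softmax is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    # Log-softmax: log pi(a | s) for every action.
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the action actually demonstrated.
    chosen = log_probs[np.arange(len(actions)), actions]
    return -chosen.mean()
```

Minimizing this quantity over the network parameters is the maximum likelihood estimate; a policy that is uniform over four actions, for instance, has a loss of log 4.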

* 24 pages, 15 figures

Online Baum-Welch algorithm for Hierarchical Imitation Learning

Mar 22, 2021

Vittorio Giammarino, Ioannis Ch. Paschalidis

Abstract: The options framework for hierarchical reinforcement learning has grown in popularity in recent years and has made progress on the scalability problem in reinforcement learning. Yet, most of these recent successes are linked to proper options initialization or discovery. When an expert is available, the options discovery problem can be addressed by learning an options-type hierarchical policy directly from expert demonstrations. This problem is referred to as hierarchical imitation learning and can be handled as an inference problem in a Hidden Markov Model, which is done via an Expectation-Maximization type algorithm. In this work, we propose a novel online algorithm to perform hierarchical imitation learning in the options framework. Further, we discuss the benefits of such an algorithm and compare it with its batch version in classical reinforcement learning benchmarks. We show that this approach works well in both discrete and continuous environments and that, under certain conditions, it outperforms the batch version.
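The Baum-Welch algorithm named in the title is the Expectation-Maximization procedure for Hidden Markov Models, and its E-step builds on the forward recursion. The sketch below shows only that building block for a plain discrete HMM, in its standard batch form. It is not the paper's online algorithm, and it leaves out the options structure; the function name and argument layout are assumptions.

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Log-likelihood of an observation sequence under a discrete HMM.

    pi:  (S,) initial state distribution.
    A:   (S, S) transition matrix, A[i, j] = P(next state j | state i).
    B:   (S, O) emission matrix, B[i, o] = P(observation o | state i).
    obs: sequence of observation indices, length >= 1.
    """
    # Unnormalized forward message for the first observation.
    alpha = pi * B[:, obs[0]]
    log_lik = 0.0
    for o in obs[1:]:
        # Rescale at each step to avoid underflow on long sequences.
        scale = alpha.sum()
        log_lik += np.log(scale)
        # Propagate through the transitions, then weight by the emission.
        alpha = (alpha / scale) @ A * B[:, o]
    return log_lik + np.log(alpha.sum())
```

In hierarchical imitation learning the hidden state would play the role of the active option and the observations would be the expert's state-action pairs, but that mapping is only indicated here, not implemented.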

* 15 pages, 2 figures
