Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hamed Habibi

Motion Control in Multi-Rotor Aerial Robots Using Deep Reinforcement Learning

Feb 09, 2025

Gaurav Shetty, Mahya Ramezani, Hamed Habibi, Holger Voos, Jose Luis Sanchez-Lopez

Abstract:This paper investigates the application of Deep Reinforcement (DRL) Learning to address motion control challenges in drones for additive manufacturing (AM). Drone-based additive manufacturing promises flexible and autonomous material deposition in large-scale or hazardous environments. However, achieving robust real-time control of a multi-rotor aerial robot under varying payloads and potential disturbances remains challenging. Traditional controllers like PID often require frequent parameter re-tuning, limiting their applicability in dynamic scenarios. We propose a DRL framework that learns adaptable control policies for multi-rotor drones performing waypoint navigation in AM tasks. We compare Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3) within a curriculum learning scheme designed to handle increasing complexity. Our experiments show TD3 consistently balances training stability, accuracy, and success, particularly when mass variability is introduced. These findings provide a scalable path toward robust, autonomous drone control in additive manufacturing.

Via

Access Paper or Ask Questions

A Practical Roadmap to Learning from Demonstration for Robotic Manipulators in Manufacturing

Jun 11, 2024

Alireza Barekatain, Hamed Habibi, Holger Voos

Figure 1 for A Practical Roadmap to Learning from Demonstration for Robotic Manipulators in Manufacturing

Figure 2 for A Practical Roadmap to Learning from Demonstration for Robotic Manipulators in Manufacturing

Figure 3 for A Practical Roadmap to Learning from Demonstration for Robotic Manipulators in Manufacturing

Figure 4 for A Practical Roadmap to Learning from Demonstration for Robotic Manipulators in Manufacturing

Abstract:This paper provides a structured and practical roadmap for practitioners to integrate Learning from Demonstration (LfD ) into manufacturing tasks, with a specific focus on industrial manipulators. Motivated by the paradigm shift from mass production to mass customization, it is crucial to have an easy-to-follow roadmap for practitioners with moderate expertise, to transform existing robotic processes to customizable LfD-based solutions. To realize this transformation, we devise the key questions of "What to Demonstrate", "How to Demonstrate", "How to Learn", and "How to Refine". To follow through these questions, our comprehensive guide offers a questionnaire-style approach, highlighting key steps from problem definition to solution refinement. The paper equips both researchers and industry professionals with actionable insights to deploy LfD-based solutions effectively. By tailoring the refinement criteria to manufacturing settings, the paper addresses related challenges and strategies for enhancing LfD performance in manufacturing contexts.

* 26 pages, 6 figures

Via

Access Paper or Ask Questions

DFL-TORO: A One-Shot Demonstration Framework for Learning Time-Optimal Robotic Manufacturing Tasks

Sep 18, 2023

Alireza Barekatain, Hamed Habibi, Holger Voos

Figure 1 for DFL-TORO: A One-Shot Demonstration Framework for Learning Time-Optimal Robotic Manufacturing Tasks

Figure 2 for DFL-TORO: A One-Shot Demonstration Framework for Learning Time-Optimal Robotic Manufacturing Tasks

Figure 3 for DFL-TORO: A One-Shot Demonstration Framework for Learning Time-Optimal Robotic Manufacturing Tasks

Figure 4 for DFL-TORO: A One-Shot Demonstration Framework for Learning Time-Optimal Robotic Manufacturing Tasks

Abstract:This paper presents DFL-TORO, a novel Demonstration Framework for Learning Time-Optimal Robotic tasks via One-shot kinesthetic demonstration. It aims at optimizing the process of Learning from Demonstration (LfD), applied in the manufacturing sector. As the effectiveness of LfD is challenged by the quality and efficiency of human demonstrations, our approach offers a streamlined method to intuitively capture task requirements from human teachers, by reducing the need for multiple demonstrations. Furthermore, we propose an optimization-based smoothing algorithm that ensures time-optimal and jerk-regulated demonstration trajectories, while also adhering to the robot's kinematic constraints. The result is a significant reduction in noise, thereby boosting the robot's operation efficiency. Evaluations using a Franka Emika Research 3 (FR3) robot for a reaching task further substantiate the efficacy of our framework, highlighting its potential to transform kinesthetic demonstrations in contemporary manufacturing environments.

* 7 pages, 7 figures

Via

Access Paper or Ask Questions

Task-Oriented Communication Design at Scale

May 15, 2023

Arsham Mostaani, Thang X. Vu, Hamed Habibi, Symeon Chatzinotas, Bjorn Ottersten

Figure 1 for Task-Oriented Communication Design at Scale

Figure 2 for Task-Oriented Communication Design at Scale

Figure 3 for Task-Oriented Communication Design at Scale

Figure 4 for Task-Oriented Communication Design at Scale

Abstract:With countless promising applications in various domains such as IoT and industry 4.0, task-oriented communication design (TOCD) is getting accelerated attention from the research community. This paper presents a novel approach for designing scalable task-oriented quantization and communications in cooperative multi-agent systems (MAS). The proposed approach utilizes the TOCD framework and the value of information (VoI) concept to enable efficient communication of quantized observations among agents while maximizing the average return performance of the MAS, a parameter that quantifies the MAS's task effectiveness. The computational complexity of learning the VoI, however, grows exponentially with the number of agents. Thus, we propose a three-step framework: i) learning the VoI (using reinforcement learning (RL)) for a two-agent system, ii) designing the quantization policy for an $N$-agent MAS using the learned VoI for a range of bit-budgets and, (iii) learning the agents' control policies using RL while following the designed quantization policies in the earlier step. We observe that one can reduce the computational cost of obtaining the value of information by exploiting insights gained from studying a similar two-agent system - instead of the original $N$-agent system. We then quantize agents' observations such that their more valuable observations are communicated more precisely. Our analytical results show the applicability of the proposed framework under a wide range of problems. Numerical results show striking improvements in reducing the computational complexity of obtaining VoI needed for the TOCD in a MAS problem without compromising the average return performance of the MAS.

Via

Access Paper or Ask Questions

UAV Path Planning Employing MPC- Reinforcement Learning Method Considering Collision Avoidance

Mar 07, 2023

Mahya Ramezani, Hamed Habibi, Jose luis Sanchez Lopez, Holger Voos

Abstract:In this paper, we tackle the problem of Unmanned Aerial (UA V) path planning in complex and uncertain environments by designing a Model Predictive Control (MPC), based on a Long-Short-Term Memory (LSTM) network integrated into the Deep Deterministic Policy Gradient algorithm. In the proposed solution, LSTM-MPC operates as a deterministic policy within the DDPG network, and it leverages a predicting pool to store predicted future states and actions for improved robustness and efficiency. The use of the predicting pool also enables the initialization of the critic network, leading to improved convergence speed and reduced failure rate compared to traditional reinforcement learning and deep reinforcement learning methods. The effectiveness of the proposed solution is evaluated by numerical simulations.

Via

Access Paper or Ask Questions

An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance

Mar 03, 2023

D. M. K. K. Venkateswara Rao, Hamed Habibi, Jose Luis Sanchez-Lopez, Holger Voos

Figure 1 for An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance

Figure 2 for An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance

Figure 3 for An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance

Figure 4 for An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance

Abstract:This paper presents an integrated approach that combines trajectory optimization and Artificial Potential Field (APF) method for real-time optimal Unmanned Aerial Vehicle (UAV) trajectory planning and dynamic collision avoidance. A minimum-time trajectory optimization problem is formulated with initial and final positions as boundary conditions and collision avoidance as constraints. It is transcribed into a nonlinear programming problem using Chebyshev pseudospectral method. The state and control histories are approximated by using Lagrange polynomials and the collocation points are used to satisfy constraints. A novel sigmoid-type collision avoidance constraint is proposed to overcome the drawbacks of Lagrange polynomial approximation in pseudospectral methods that only guarantees inequality constraint satisfaction only at nodal points. Automatic differentiation of cost function and constraints is used to quickly determine their gradient and Jacobian, respectively. An APF method is used to update the optimal control inputs for guaranteeing collision avoidance. The trajectory optimization and APF method are implemented in a closed-loop fashion continuously, but in parallel at moderate and high frequencies, respectively. The initial guess for the optimization is provided based on the previous solution. The proposed approach is tested and validated through indoor experiments.

Via

Access Paper or Ask Questions