Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Vignesh Prasad

2HandedAfforder: Learning Precise Actionable Bimanual Affordances from Human Videos

Mar 13, 2025

Marvin Heidinger, Snehal Jauhri, Vignesh Prasad, Georgia Chalvatzaki

Abstract:When interacting with objects, humans effectively reason about which regions of objects are viable for an intended action, i.e., the affordance regions of the object. They can also account for subtle differences in object regions based on the task to be performed and whether one or two hands need to be used. However, current vision-based affordance prediction methods often reduce the problem to naive object part segmentation. In this work, we propose a framework for extracting affordance data from human activity video datasets. Our extracted 2HANDS dataset contains precise object affordance region segmentations and affordance class-labels as narrations of the activity performed. The data also accounts for bimanual actions, i.e., two hands co-ordinating and interacting with one or more objects. We present a VLM-based affordance prediction model, 2HandedAfforder, trained on the dataset and demonstrate superior performance over baselines in affordance region segmentation for various activities. Finally, we show that our predicted affordance regions are actionable, i.e., can be used by an agent performing a task, through demonstration in robotic manipulation scenarios.

* Project site: https://sites.google.com/view/2handedafforder

Via

Access Paper or Ask Questions

6DOPE-GS: Online 6D Object Pose Estimation using Gaussian Splatting

Dec 02, 2024

Yufeng Jin, Vignesh Prasad, Snehal Jauhri, Mathias Franzius, Georgia Chalvatzaki

Abstract:Efficient and accurate object pose estimation is an essential component for modern vision systems in many applications such as Augmented Reality, autonomous driving, and robotics. While research in model-based 6D object pose estimation has delivered promising results, model-free methods are hindered by the high computational load in rendering and inferring consistent poses of arbitrary objects in a live RGB-D video stream. To address this issue, we present 6DOPE-GS, a novel method for online 6D object pose estimation \& tracking with a single RGB-D camera by effectively leveraging advances in Gaussian Splatting. Thanks to the fast differentiable rendering capabilities of Gaussian Splatting, 6DOPE-GS can simultaneously optimize for 6D object poses and 3D object reconstruction. To achieve the necessary efficiency and accuracy for live tracking, our method uses incremental 2D Gaussian Splatting with an intelligent dynamic keyframe selection procedure to achieve high spatial object coverage and prevent erroneous pose updates. We also propose an opacity statistic-based pruning mechanism for adaptive Gaussian density control, to ensure training stability and efficiency. We evaluate our method on the HO3D and YCBInEOAT datasets and show that 6DOPE-GS matches the performance of state-of-the-art baselines for model-free simultaneous 6D pose tracking and reconstruction while providing a 5$\times$ speedup. We also demonstrate the method's suitability for live, dynamic object tracking and reconstruction in a real-world setting.

Via

Access Paper or Ask Questions

ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching

Sep 06, 2024

Niklas Funk, Julen Urain, Joao Carvalho, Vignesh Prasad, Georgia Chalvatzaki, Jan Peters

Figure 1 for ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching

Figure 2 for ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching

Figure 3 for ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching

Figure 4 for ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching

Abstract:Spatial understanding is a critical aspect of most robotic tasks, particularly when generalization is important. Despite the impressive results of deep generative models in complex manipulation tasks, the absence of a representation that encodes intricate spatial relationships between observations and actions often limits spatial generalization, necessitating large amounts of demonstrations. To tackle this problem, we introduce a novel policy class, ActionFlow. ActionFlow integrates spatial symmetry inductive biases while generating expressive action sequences. On the representation level, ActionFlow introduces an SE(3) Invariant Transformer architecture, which enables informed spatial reasoning based on the relative SE(3) poses between observations and actions. For action generation, ActionFlow leverages Flow Matching, a state-of-the-art deep generative model known for generating high-quality samples with fast inference - an essential property for feedback control. In combination, ActionFlow policies exhibit strong spatial and locality biases and SE(3)-equivariant action generation. Our experiments demonstrate the effectiveness of ActionFlow and its two main components on several simulated and real-world robotic manipulation tasks and confirm that we can obtain equivariant, accurate, and efficient policies with spatially symmetric flow matching. Project website: https://flowbasedpolicies.github.io/

Via

Access Paper or Ask Questions

MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations

Jul 10, 2024

Vignesh Prasad, Alap Kshirsagar, Dorothea Koert, Ruth Stock-Homburg, Jan Peters, Georgia Chalvatzaki

Abstract:Shared dynamics models are important for capturing the complexity and variability inherent in Human-Robot Interaction (HRI). Therefore, learning such shared dynamics models can enhance coordination and adaptability to enable successful reactive interactions with a human partner. In this work, we propose a novel approach for learning a shared latent space representation for HRIs from demonstrations in a Mixture of Experts fashion for reactively generating robot actions from human observations. We train a Variational Autoencoder (VAE) to learn robot motions regularized using an informative latent space prior that captures the multimodality of the human observations via a Mixture Density Network (MDN). We show how our formulation derives from a Gaussian Mixture Regression formulation that is typically used approaches for learning HRI from demonstrations such as using an HMM/GMM for learning a joint distribution over the actions of the human and the robot. We further incorporate an additional regularization to prevent "mode collapse", a common phenomenon when using latent space mixture models with VAEs. We find that our approach of using an informative MDN prior from human observations for a VAE generates more accurate robot motions compared to previous HMM-based or recurrent approaches of learning shared latent representations, which we validate on various HRI datasets involving interactions such as handshakes, fistbumps, waving, and handovers. Further experiments in a real-world human-to-robot handover scenario show the efficacy of our approach for generating successful interactions with four different human interaction partners.

* Preprint version of paper accepted at IEEE RAL. Project URL: https://bit.ly/MoVEInt

Via

Access Paper or Ask Questions

Kinematically Constrained Human-like Bimanual Robot-to-Human Handovers

Feb 22, 2024

Yasemin Göksu, Antonio De Almeida Correia, Vignesh Prasad, Alap Kshirsagar, Dorothea Koert, Jan Peters, Georgia Chalvatzaki

Figure 1 for Kinematically Constrained Human-like Bimanual Robot-to-Human Handovers

Figure 2 for Kinematically Constrained Human-like Bimanual Robot-to-Human Handovers

Abstract:Bimanual handovers are crucial for transferring large, deformable or delicate objects. This paper proposes a framework for generating kinematically constrained human-like bimanual robot motions to ensure seamless and natural robot-to-human object handovers. We use a Hidden Semi-Markov Model (HSMM) to reactively generate suitable response trajectories for a robot based on the observed human partner's motion. The trajectories are adapted with task space constraints to ensure accurate handovers. Results from a pilot study show that our approach is perceived as more human--like compared to a baseline Inverse Kinematics approach.

* Accepted as a Late Breaking Report in The ACM/IEEE International Conference on Human Robot Interaction (HRI) 2024

Via

Access Paper or Ask Questions

Transition State Clustering for Interaction Segmentation and Learning

Feb 22, 2024

Fabian Hahne, Vignesh Prasad, Alap Kshirsagar, Dorothea Koert, Ruth Maria Stock-Homburg, Jan Peters, Georgia Chalvatzaki

Abstract:Hidden Markov Models with an underlying Mixture of Gaussian structure have proven effective in learning Human-Robot Interactions from demonstrations for various interactive tasks via Gaussian Mixture Regression. However, a mismatch occurs when segmenting the interaction using only the observed state of the human compared to the joint state of the human and the robot. To enhance this underlying segmentation and subsequently the predictive abilities of such Gaussian Mixture-based approaches, we take a hierarchical approach by learning an additional mixture distribution on the states at the transition boundary. This helps prevent misclassifications that usually occur in such states. We find that our framework improves the performance of the underlying Gaussian Mixture-based approach, which we evaluate on various interactive tasks such as handshaking and fistbumps.

* Accepted as a Late Breaking Report in The ACM/IEEE International Conference on Human Robot Interaction (HRI) 2024

Via

Access Paper or Ask Questions

Learning Multimodal Latent Dynamics for Human-Robot Interaction

Nov 27, 2023

Vignesh Prasad, Lea Heitlinger, Dorothea Koert, Ruth Stock-Homburg, Jan Peters, Georgia Chalvatzaki

Abstract:This article presents a method for learning well-coordinated Human-Robot Interaction (HRI) from Human-Human Interactions (HHI). We devise a hybrid approach using Hidden Markov Models (HMMs) as the latent space priors for a Variational Autoencoder to model a joint distribution over the interacting agents. We leverage the interaction dynamics learned from HHI to learn HRI and incorporate the conditional generation of robot motions from human observations into the training, thereby predicting more accurate robot trajectories. The generated robot motions are further adapted with Inverse Kinematics to ensure the desired physical proximity with a human, combining the ease of joint space learning and accurate task space reachability. For contact-rich interactions, we modulate the robot's stiffness using HMM segmentation for a compliant interaction. We verify the effectiveness of our approach deployed on a Humanoid robot via a user study. Our method generalizes well to various humans despite being trained on data from just two humans. We find that Users perceive our method as more human-like, timely, and accurate and rank our method with a higher degree of preference over other baselines.

* 20 Pages, 10 Figures

Via

Access Paper or Ask Questions

Learning Human-like Hand Reaching for Human-Robot Handshaking

Feb 28, 2021

Vignesh Prasad, Ruth Stock-Homburg, Jan Peters

Figure 1 for Learning Human-like Hand Reaching for Human-Robot Handshaking

Figure 2 for Learning Human-like Hand Reaching for Human-Robot Handshaking

Figure 3 for Learning Human-like Hand Reaching for Human-Robot Handshaking

Figure 4 for Learning Human-like Hand Reaching for Human-Robot Handshaking

Abstract:One of the first and foremost non-verbal interactions that humans perform is a handshake. It has an impact on first impressions as touch can convey complex emotions. This makes handshaking an important skill for the repertoire of a social robot. In this paper, we present a novel framework for learning human-robot handshaking behaviours for humanoid robots solely using third-person human-human interaction data. This is especially useful for non-backdrivable robots that cannot be taught by demonstrations via kinesthetic teaching. Our approach can be easily executed on different humanoid robots. This removes the need for re-training, which is especially tedious when training with human-interaction partners. We show this by applying the learnt behaviours on two different humanoid robots with similar degrees of freedom but different shapes and control limits.

* Pre-print version. Accepted for Publication at ICRA'21

Via

Access Paper or Ask Questions

Human-Robot Handshaking: A Review

Feb 14, 2021

Vignesh Prasad, Ruth Stock-Homburg, Jan Peters

Figure 1 for Human-Robot Handshaking: A Review

Figure 2 for Human-Robot Handshaking: A Review

Figure 3 for Human-Robot Handshaking: A Review

Figure 4 for Human-Robot Handshaking: A Review

Abstract:For some years now, the use of social, anthropomorphic robots in various situations has been on the rise. These are robots developed to interact with humans and are equipped with corresponding extremities. They already support human users in various industries, such as retail, gastronomy, hotels, education and healthcare. During such Human-Robot Interaction (HRI) scenarios, physical touch plays a central role in the various applications of social robots as interactive non-verbal behaviour is a key factor in making the interaction more natural. Shaking hands is a simple, natural interaction used commonly in many social contexts and is seen as a symbol of greeting, farewell and congratulations. In this paper, we take a look at the existing state of Human-Robot Handshaking research, categorise the works based on their focus areas, draw out the major findings of these areas while analysing their pitfalls. We mainly see that some form of synchronisation exists during the different phases of the interaction. In addition to this, we also find that additional factors like gaze, voice facial expressions etc. can affect the perception of a robotic handshake and that internal factors like personality and mood can affect the way in which handshaking behaviours are executed by humans. Based on the findings and insights, we finally discuss possible ways forward for research on such physically interactive behaviours.

* Pre-print version. Accepted for publication in the International Journal of Social Robotics

Via

Access Paper or Ask Questions

Advances in Human-Robot Handshaking

Aug 26, 2020

Vignesh Prasad, Ruth Stock-Homburg, Jan Peters

Figure 1 for Advances in Human-Robot Handshaking

Abstract:The use of social, anthropomorphic robots to support humans in various industries has been on the rise. During Human-Robot Interaction (HRI), physically interactive non-verbal behaviour is key for more natural interactions. Handshaking is one such natural interaction used commonly in many social contexts. It is one of the first non-verbal interactions which takes place and should, therefore, be part of the repertoire of a social robot. In this paper, we explore the existing state of Human-Robot Handshaking and discuss possible ways forward for such physically interactive behaviours.

* Accepted at The 12th International Conference on Social Robotics (ICSR 2020) 12 Pages, 1 Figure

Via

Access Paper or Ask Questions