Daniel Sliwowski

REASSEMBLE: A Multimodal Dataset for Contact-rich Robotic Assembly and Disassembly

Feb 07, 2025

Daniel Sliwowski, Shail Jadav, Sergej Stanovcic, Jedrzej Orbik, Johannes Heidersberger, Dongheui Lee

Abstract: Robotic manipulation remains a core challenge in robotics, particularly for contact-rich tasks such as industrial assembly and disassembly. Existing datasets have significantly advanced learning in manipulation but are primarily focused on simpler tasks like object rearrangement, falling short of capturing the complexity and physical dynamics involved in assembly and disassembly. To bridge this gap, we present REASSEMBLE (Robotic assEmbly disASSEMBLy datasEt), a new dataset designed specifically for contact-rich manipulation tasks. Built around the NIST Assembly Task Board 1 benchmark, REASSEMBLE includes four actions (pick, insert, remove, and place) involving 17 objects. The dataset contains 4,551 demonstrations, of which 4,035 were successful, spanning a total of 781 minutes. Our dataset features multi-modal sensor data including event cameras, force-torque sensors, microphones, and multi-view RGB cameras. This diverse dataset supports research in areas such as learning contact-rich manipulation, task condition identification, action segmentation, and more. We believe REASSEMBLE will be a valuable resource for advancing robotic manipulation in complex, real-world scenarios. The dataset is publicly available on our project website: https://dsliwowski1.github.io/REASSEMBLE_page.

* 16 pages, 12 figures, 1 table

ConditionNET: Learning Preconditions and Effects for Execution Monitoring

Feb 03, 2025

Daniel Sliwowski, Dongheui Lee

Abstract: The introduction of robots into everyday scenarios necessitates algorithms capable of monitoring the execution of tasks. In this paper, we propose ConditionNET, an approach for learning the preconditions and effects of actions in a fully data-driven manner. We develop an efficient vision-language model and introduce additional optimization objectives during training to optimize for consistent feature representations. ConditionNET explicitly models the dependencies between actions, preconditions, and effects, leading to improved performance. We evaluate our model on two robotic datasets, one of which we collected for this paper, containing 406 successful and 138 failed teleoperated demonstrations of a Franka Emika Panda robot performing tasks like pouring and cleaning the counter. We show in our experiments that ConditionNET outperforms all baselines on both anomaly detection and phase prediction tasks. Furthermore, we implement an action monitoring system on a real robot to demonstrate the practical applicability of the learned preconditions and effects. Our results highlight the potential of ConditionNET for enhancing the reliability and adaptability of robots in real-world environments. The data is available on the project website: https://dsliwowski1.github.io/ConditionNET_page.

* in IEEE Robotics and Automation Letters, vol. 10, no. 2, pp. 1337-1344, Feb. 2025
* 9 pages, 5 figures, 3 tables

HOI4ABOT: Human-Object Interaction Anticipation for Human Intention Reading Collaborative roBOTs

Sep 28, 2023

Esteve Valls Mascaro, Daniel Sliwowski, Dongheui Lee

Abstract: Robots are becoming increasingly integrated into our lives, assisting us in various tasks. To ensure effective collaboration between humans and robots, it is essential that they understand our intentions and anticipate our actions. In this paper, we propose a Human-Object Interaction (HOI) anticipation framework for collaborative robots. We propose an efficient and robust transformer-based model to detect and anticipate HOIs from videos. This enhanced anticipation empowers robots to proactively assist humans, resulting in more efficient and intuitive collaborations. Our model outperforms state-of-the-art results in HOI detection and anticipation on the VidHOI dataset with an increase of 1.76% and 1.04% in mAP respectively while being 15.4 times faster. We showcase the effectiveness of our approach through experimental results on a real robot, demonstrating that the robot's ability to anticipate HOIs is key for better Human-Robot Interaction. More information can be found on our project webpage: https://evm7.github.io/HOI4ABOT_page/

* In Proceedings of the Conference on Robot Learning 2023
