Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Michael Beetz

Generating Actionable Robot Knowledge Bases by Combining 3D Scene Graphs with Robot Ontologies

Jul 15, 2025

Giang Nguyen, Mihai Pomarlan, Sascha Jongebloed, Nils Leusmann, Minh Nhat Vu, Michael Beetz

Abstract:In robotics, the effective integration of environmental data into actionable knowledge remains a significant challenge due to the variety and incompatibility of data formats commonly used in scene descriptions, such as MJCF, URDF, and SDF. This paper presents a novel approach that addresses these challenges by developing a unified scene graph model that standardizes these varied formats into the Universal Scene Description (USD) format. This standardization facilitates the integration of these scene graphs with robot ontologies through semantic reporting, enabling the translation of complex environmental data into actionable knowledge essential for cognitive robotic control. We evaluated our approach by converting procedural 3D environments into USD format, which is then annotated semantically and translated into a knowledge graph to effectively answer competency questions, demonstrating its utility for real-time robotic decision-making. Additionally, we developed a web-based visualization tool to support the semantic mapping process, providing users with an intuitive interface to manage the 3D environment.

* 8 pages, 7 figures, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2025)

Via

Access Paper or Ask Questions

NOCTIS: Novel Object Cyclic Threshold based Instance Segmentation

Jul 02, 2025

Max Gandyra, Alessandro Santonicola, Michael Beetz

Abstract:Instance segmentation of novel objects instances in RGB images, given some example images for each object, is a well known problem in computer vision. Designing a model general enough to be employed, for all kinds of novel objects, without (re-) training, has proven to be a difficult task. To handle this, we propose a simple, yet powerful, framework, called: Novel Object Cyclic Threshold based Instance Segmentation (NOCTIS). This work stems from and improves upon previous ones like CNOS, SAM-6D and NIDS-Net; thus, it also leverages on recent vision foundation models, namely: Grounded-SAM 2 and DINOv2. It utilises Grounded-SAM 2 to obtain object proposals with precise bounding boxes and their corresponding segmentation masks; while DINOv2's zero-shot capabilities are employed to generate the image embeddings. The quality of those masks, together with their embeddings, is of vital importance to our approach; as the proposal-object matching is realized by determining an object matching score based on the similarity of the class embeddings and the average maximum similarity of the patch embeddings. Differently to SAM-6D, calculating the latter involves a prior patch filtering based on the distance between each patch and its corresponding cyclic/roundtrip patch in the image grid. Furthermore, the average confidence of the proposals' bounding box and mask is used as an additional weighting factor for the object matching score. We empirically show that NOCTIS, without further training/fine tuning, outperforms the best RGB and RGB-D methods on the seven core datasets of the BOP 2023 challenge for the "Model-based 2D segmentation of unseen objects" task.

* 10 pages, 3 figures, 3 tables, NeurIPS 2025 preprint

Via

Access Paper or Ask Questions

Digital Twin Generation from Visual Data: A Survey

Apr 17, 2025

Andrew Melnik, Benjamin Alt, Giang Nguyen, Artur Wilkowski, Maciej Stefańczyk, Qirui Wu, Sinan Harms, Helge Rhodin, Manolis Savva, Michael Beetz

Abstract:This survey explores recent developments in generating digital twins from videos. Such digital twins can be used for robotics application, media content creation, or design and construction works. We analyze various approaches, including 3D Gaussian Splatting, generative in-painting, semantic segmentation, and foundation models highlighting their advantages and limitations. Additionally, we discuss challenges such as occlusions, lighting variations, and scalability, as well as potential future research directions. This survey aims to provide a comprehensive overview of state-of-the-art methodologies and their implications for real-world applications. Awesome list: https://github.com/ndrwmlnk/awesome-digital-twins

Via

Access Paper or Ask Questions

Towards a cognitive architecture to enable natural language interaction in co-constructive task learning

Mar 31, 2025

Manuel Scheibl, Birte Richter, Alissa Müller, Michael Beetz, Britta Wrede

Abstract:This research addresses the question, which characteristics a cognitive architecture must have to leverage the benefits of natural language in Co-Constructive Task Learning (CCTL). To provide context, we first discuss Interactive Task Learning (ITL), the mechanisms of the human memory system, and the significance of natural language and multi-modality. Next, we examine the current state of cognitive architectures, analyzing their capabilities to inform a concept of CCTL grounded in multiple sources. We then integrate insights from various research domains to develop a unified framework. Finally, we conclude by identifying the remaining challenges and requirements necessary to achieve CCTL in Human-Robot Interaction (HRI).

* 8 pages, 5 figures, submitted to: IEEE RO-MAN 2025

Via

Access Paper or Ask Questions

A Framework for Adapting Human-Robot Interaction to Diverse User Groups

Oct 15, 2024

Theresa Pekarek Rosin, Vanessa Hassouna, Xiaowen Sun, Luca Krohm, Henri-Leon Kordt, Michael Beetz, Stefan Wermter

Figure 1 for A Framework for Adapting Human-Robot Interaction to Diverse User Groups

Figure 2 for A Framework for Adapting Human-Robot Interaction to Diverse User Groups

Figure 3 for A Framework for Adapting Human-Robot Interaction to Diverse User Groups

Figure 4 for A Framework for Adapting Human-Robot Interaction to Diverse User Groups

Abstract:To facilitate natural and intuitive interactions with diverse user groups in real-world settings, social robots must be capable of addressing the varying requirements and expectations of these groups while adapting their behavior based on user feedback. While previous research often focuses on specific demographics, we present a novel framework for adaptive Human-Robot Interaction (HRI) that tailors interactions to different user groups and enables individual users to modulate interactions through both minor and major interruptions. Our primary contributions include the development of an adaptive, ROS-based HRI framework with an open-source code base. This framework supports natural interactions through advanced speech recognition and voice activity detection, and leverages a large language model (LLM) as a dialogue bridge. We validate the efficiency of our framework through module tests and system trials, demonstrating its high accuracy in age recognition and its robustness to repeated user inputs and plan changes.

* Accepted at the 16th International Conference on Social Robotics (ICSR) 2024

Via

Access Paper or Ask Questions

Shadow Program Inversion with Differentiable Planning: A Framework for Unified Robot Program Parameter and Trajectory Optimization

Sep 13, 2024

Benjamin Alt, Claudius Kienle, Darko Katic, Rainer Jäkel, Michael Beetz

Figure 1 for Shadow Program Inversion with Differentiable Planning: A Framework for Unified Robot Program Parameter and Trajectory Optimization

Figure 2 for Shadow Program Inversion with Differentiable Planning: A Framework for Unified Robot Program Parameter and Trajectory Optimization

Figure 3 for Shadow Program Inversion with Differentiable Planning: A Framework for Unified Robot Program Parameter and Trajectory Optimization

Figure 4 for Shadow Program Inversion with Differentiable Planning: A Framework for Unified Robot Program Parameter and Trajectory Optimization

Abstract:This paper presents SPI-DP, a novel first-order optimizer capable of optimizing robot programs with respect to both high-level task objectives and motion-level constraints. To that end, we introduce DGPMP2-ND, a differentiable collision-free motion planner for serial N-DoF kinematics, and integrate it into an iterative, gradient-based optimization approach for generic, parameterized robot program representations. SPI-DP allows first-order optimization of planned trajectories and program parameters with respect to objectives such as cycle time or smoothness subject to e.g. collision constraints, while enabling humans to understand, modify or even certify the optimized programs. We provide a comprehensive evaluation on two practical household and industrial applications.

* 8 pages, 6 figures, submitted to the 2025 IEEE International Conference on Robotics & Automation (ICRA)

Via

Access Paper or Ask Questions

MARLIN: A Cloud Integrated Robotic Solution to Support Intralogistics in Retail

Jul 02, 2024

Dennis Mronga, Andreas Bresser, Fabian Maas, Adrian Danzglock, Simon Stelter, Alina Hawkin, Hoang Giang Nguyen, Michael Beetz, Frank Kirchner

Abstract:In this paper, we present the service robot MARLIN and its integration with the K4R platform, a cloud system for complex AI applications in retail. At its core, this platform contains so-called semantic digital twins, a semantically annotated representation of the retail store. MARLIN continuously exchanges data with the K4R platform, improving the robot's capabilities in perception, autonomous navigation, and task planning. We exploit these capabilities in a retail intralogistics scenario, specifically by assisting store employees in stocking shelves. We demonstrate that MARLIN is able to update the digital representation of the retail store by detecting and classifying obstacles, autonomously planning and executing replenishment missions, adapting to unforeseen changes in the environment, and interacting with store employees. Experiments are conducted in simulation, in a laboratory environment, and in a real store. We also describe and evaluate a novel algorithm for autonomous navigation of articulated tractor-trailer systems. The algorithm outperforms the manufacturer's proprietary navigation approach and improves MARLIN's navigation capabilities in confined spaces.

* MARLIN: A cloud integrated robotic solution to support intralogistics in retail, Robotics and Autonomous Systems, Volume 175, 2024

Via

Access Paper or Ask Questions

Human-AI Interaction in Industrial Robotics: Design and Empirical Evaluation of a User Interface for Explainable AI-Based Robot Program Optimization

Apr 30, 2024

Benjamin Alt, Johannes Zahn, Claudius Kienle, Julia Dvorak, Marvin May, Darko Katic, Rainer Jäkel, Tobias Kopp, Michael Beetz, Gisela Lanza

Figure 1 for Human-AI Interaction in Industrial Robotics: Design and Empirical Evaluation of a User Interface for Explainable AI-Based Robot Program Optimization

Figure 2 for Human-AI Interaction in Industrial Robotics: Design and Empirical Evaluation of a User Interface for Explainable AI-Based Robot Program Optimization

Figure 3 for Human-AI Interaction in Industrial Robotics: Design and Empirical Evaluation of a User Interface for Explainable AI-Based Robot Program Optimization

Figure 4 for Human-AI Interaction in Industrial Robotics: Design and Empirical Evaluation of a User Interface for Explainable AI-Based Robot Program Optimization

Abstract:While recent advances in deep learning have demonstrated its transformative potential, its adoption for real-world manufacturing applications remains limited. We present an Explanation User Interface (XUI) for a state-of-the-art deep learning-based robot program optimizer which provides both naive and expert users with different user experiences depending on their skill level, as well as Explainable AI (XAI) features to facilitate the application of deep learning methods in real-world applications. To evaluate the impact of the XUI on task performance, user satisfaction and cognitive load, we present the results of a preliminary user survey and propose a study design for a large-scale follow-up study.

* 6 pages, 4 figures, accepted at the 2024 CIRP International Conference on Manufacturing Systems (CMS)

Via

Access Paper or Ask Questions

BANSAI: Towards Bridging the AI Adoption Gap in Industrial Robotics with Neurosymbolic Programming

Apr 21, 2024

Benjamin Alt, Julia Dvorak, Darko Katic, Rainer Jäkel, Michael Beetz, Gisela Lanza

Figure 1 for BANSAI: Towards Bridging the AI Adoption Gap in Industrial Robotics with Neurosymbolic Programming

Figure 2 for BANSAI: Towards Bridging the AI Adoption Gap in Industrial Robotics with Neurosymbolic Programming

Figure 3 for BANSAI: Towards Bridging the AI Adoption Gap in Industrial Robotics with Neurosymbolic Programming

Abstract:Over the past decade, deep learning helped solve manipulation problems across all domains of robotics. At the same time, industrial robots continue to be programmed overwhelmingly using traditional program representations and interfaces. This paper undertakes an analysis of this "AI adoption gap" from an industry practitioner's perspective. In response, we propose the BANSAI approach (Bridging the AI Adoption Gap via Neurosymbolic AI). It systematically leverages principles of neurosymbolic AI to establish data-driven, subsymbolic program synthesis and optimization in modern industrial robot programming workflow. BANSAI conceptually unites several lines of prior research and proposes a path toward practical, real-world validation.

* 6 pages, 3 figures, accepted at the 2024 CIRP International Conference on Manufacturing Systems (CMS)

Via

Access Paper or Ask Questions

Cloud-based Digital Twin for Cognitive Robotics

Apr 19, 2024

Arthur Niedźwiecki, Sascha Jongebloed, Yanxiang Zhan, Michaela Kümpel, Jörn Syrbe, Michael Beetz

Abstract:The paper presents a novel cloud-based digital twin learning platform for teaching and training concepts of cognitive robotics. Instead of forcing interested learners or students to install a new operating system and bulky, fragile software onto their personal laptops just to solve tutorials or coding assignments of a single lecture on robotics, it would be beneficial to avoid technical setups and directly dive into the content of cognitive robotics. To achieve this, the authors utilize containerization technologies and Kubernetes to deploy and operate containerized applications, including robotics simulation environments and software collections based on the Robot operating System (ROS). The web-based Integrated Development Environment JupyterLab is integrated with RvizWeb and XPRA to provide real-time visualization of sensor data and robot behavior in a user-friendly environment for interacting with robotics software. The paper also discusses the application of the platform in teaching Knowledge Representation, Reasoning, Acquisition and Retrieval, and Task-Executives. The authors conclude that the proposed platform is a valuable tool for education and research in cognitive robotics, and that it has the potential to democratize access to these fields. The platform has already been successfully employed in various academic courses, demonstrating its effectiveness in fostering knowledge and skill development.

* 5 pages, IEEE Global Engineering Education Conference (EDUCON)

Via

Access Paper or Ask Questions