Benjamin Alt

Digital Twin Generation from Visual Data: A Survey

Apr 17, 2025

Andrew Melnik, Benjamin Alt, Giang Nguyen, Artur Wilkowski, Maciej Stefańczyk, Qirui Wu, Sinan Harms, Helge Rhodin, Manolis Savva, Michael Beetz

Abstract: This survey explores recent developments in generating digital twins from videos. Such digital twins can be used for robotics applications, media content creation, or design and construction works. We analyze various approaches, including 3D Gaussian Splatting, generative in-painting, semantic segmentation, and foundation models, highlighting their advantages and limitations. Additionally, we discuss challenges such as occlusions, lighting variations, and scalability, as well as potential future research directions. This survey aims to provide a comprehensive overview of state-of-the-art methodologies and their implications for real-world applications. Awesome list: https://github.com/ndrwmlnk/awesome-digital-twins

AI-based Framework for Robust Model-Based Connector Mating in Robotic Wire Harness Installation

Mar 12, 2025

Claudius Kienle, Benjamin Alt, Finn Schneider, Tobias Pertlwieser, Rainer Jäkel, Rania Rayyes

Abstract: Despite the widespread adoption of industrial robots in automotive assembly, wire harness installation remains a largely manual process, as it requires precise and flexible manipulation. To address this challenge, we design a novel AI-based framework that automates cable connector mating by integrating force control with deep visuotactile learning. Our system optimizes search-and-insertion strategies using first-order optimization over a multimodal transformer architecture trained on visual, tactile, and proprioceptive data. Additionally, we design a novel automated data collection and optimization pipeline that minimizes the need for machine learning expertise. The framework optimizes robot programs that run natively on standard industrial controllers, permitting human experts to audit and certify them. Experimental validations on a center console assembly task demonstrate significant improvements in cycle times and robustness compared to conventional robot programming approaches. Videos are available at https://claudius-kienle.github.io/AppMuTT.

* 6 pages, 6 figures, 4 tables, submitted to the 2025 IEEE 21st International Conference on Automation Science and Engineering

QueryCAD: Grounded Question Answering for CAD Models

Sep 16, 2024

Claudius Kienle, Benjamin Alt, Darko Katic, Rainer Jäkel

Abstract: CAD models are widely used in industry and are essential for robotic automation processes. However, these models are rarely considered in novel AI-based approaches, such as the automatic synthesis of robot programs, as there are no readily available methods that would allow CAD models to be incorporated for the analysis, interpretation, or extraction of information. To address these limitations, we propose QueryCAD, the first system designed for CAD question answering, enabling the extraction of precise information from CAD models using natural language queries. QueryCAD incorporates SegCAD, an open-vocabulary instance segmentation model we developed to identify and select specific parts of the CAD model based on part descriptions. We further propose a CAD question answering benchmark to evaluate QueryCAD and establish a foundation for future research. Lastly, we integrate QueryCAD within an automatic robot program synthesis framework, validating its ability to enhance deep-learning solutions for robotics by enabling them to process CAD models (https://claudius-kienle.github.com/querycad).

Shadow Program Inversion with Differentiable Planning: A Framework for Unified Robot Program Parameter and Trajectory Optimization

Sep 13, 2024

Benjamin Alt, Claudius Kienle, Darko Katic, Rainer Jäkel, Michael Beetz

Abstract: This paper presents SPI-DP, a novel first-order optimizer capable of optimizing robot programs with respect to both high-level task objectives and motion-level constraints. To that end, we introduce DGPMP2-ND, a differentiable collision-free motion planner for serial N-DoF kinematics, and integrate it into an iterative, gradient-based optimization approach for generic, parameterized robot program representations. SPI-DP allows first-order optimization of planned trajectories and program parameters with respect to objectives such as cycle time or smoothness, subject to constraints such as collision avoidance, while enabling humans to understand, modify, or even certify the optimized programs. We provide a comprehensive evaluation on two practical household and industrial applications.

* 8 pages, 6 figures, submitted to the 2025 IEEE International Conference on Robotics & Automation (ICRA)

MuTT: A Multimodal Trajectory Transformer for Robot Skills

Jul 22, 2024

Claudius Kienle, Benjamin Alt, Onur Celik, Philipp Becker, Darko Katic, Rainer Jäkel, Gerhard Neumann

Abstract: High-level robot skills represent an increasingly popular paradigm in robot programming. However, configuring the skills' parameters for a specific task remains a manual and time-consuming endeavor. Existing approaches for learning or optimizing these parameters often require numerous real-world executions or do not work in dynamic environments. To address these challenges, we propose MuTT, a novel encoder-decoder transformer architecture designed to predict environment-aware executions of robot skills by integrating vision, trajectory, and robot skill parameters. Notably, we pioneer the fusion of vision and trajectory, introducing a novel trajectory projection. Furthermore, we illustrate MuTT's efficacy as a predictor when combined with a model-based robot skill optimizer. This approach facilitates the optimization of robot skill parameters for the current environment, without the need for real-world executions during optimization. Designed for compatibility with any representation of robot skills, MuTT demonstrates its versatility across three comprehensive experiments, showcasing superior performance across two different skill representations.

Human-AI Interaction in Industrial Robotics: Design and Empirical Evaluation of a User Interface for Explainable AI-Based Robot Program Optimization

Apr 30, 2024

Benjamin Alt, Johannes Zahn, Claudius Kienle, Julia Dvorak, Marvin May, Darko Katic, Rainer Jäkel, Tobias Kopp, Michael Beetz, Gisela Lanza

Abstract: While recent advances in deep learning have demonstrated its transformative potential, its adoption for real-world manufacturing applications remains limited. We present an Explanation User Interface (XUI) for a state-of-the-art deep learning-based robot program optimizer which provides both naive and expert users with different user experiences depending on their skill level, as well as Explainable AI (XAI) features to facilitate the application of deep learning methods in real-world applications. To evaluate the impact of the XUI on task performance, user satisfaction and cognitive load, we present the results of a preliminary user survey and propose a study design for a large-scale follow-up study.

* 6 pages, 4 figures, accepted at the 2024 CIRP International Conference on Manufacturing Systems (CMS)

BANSAI: Towards Bridging the AI Adoption Gap in Industrial Robotics with Neurosymbolic Programming

Apr 21, 2024

Benjamin Alt, Julia Dvorak, Darko Katic, Rainer Jäkel, Michael Beetz, Gisela Lanza

Abstract: Over the past decade, deep learning has helped solve manipulation problems across all domains of robotics. At the same time, industrial robots continue to be programmed overwhelmingly using traditional program representations and interfaces. This paper undertakes an analysis of this "AI adoption gap" from an industry practitioner's perspective. In response, we propose the BANSAI approach (Bridging the AI Adoption Gap via Neurosymbolic AI). It systematically leverages principles of neurosymbolic AI to establish data-driven, subsymbolic program synthesis and optimization in modern industrial robot programming workflows. BANSAI conceptually unites several lines of prior research and proposes a path toward practical, real-world validation.

* 6 pages, 3 figures, accepted at the 2024 CIRP International Conference on Manufacturing Systems (CMS)

RoboGrind: Intuitive and Interactive Surface Treatment with Industrial Robots

Feb 27, 2024

Benjamin Alt, Florian Stöckl, Silvan Müller, Christopher Braun, Julian Raible, Saad Alhasan, Oliver Rettig, Lukas Ringle, Darko Katic, Rainer Jäkel (+3 more)

Abstract: Surface treatment tasks such as grinding, sanding or polishing are a vital step of the value chain in many industries, but are notoriously challenging to automate. We present RoboGrind, an integrated system for the intuitive, interactive automation of surface treatment tasks with industrial robots. It combines a sophisticated 3D perception pipeline for surface scanning and automatic defect identification, an interactive voice-controlled wizard system for the AI-assisted bootstrapping and parameterization of robot programs, and an automatic planning and execution pipeline for force-controlled robotic surface treatment. RoboGrind is evaluated both under laboratory and real-world conditions in the context of refabricating fiberglass wind turbine blades.

* 7 pages, 6 figures, accepted to the 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

Domain-Specific Fine-Tuning of Large Language Models for Interactive Robot Programming

Dec 21, 2023

Benjamin Alt, Urs Keßner, Aleksandar Taranovic, Darko Katic, Andreas Hermann, Rainer Jäkel, Gerhard Neumann

Abstract: Industrial robots are applied in a widening range of industries, but robot programming mostly remains a task limited to programming experts. We propose a natural language-based assistant for programming of advanced, industrial robotic applications and investigate strategies for domain-specific fine-tuning of foundation models with limited data and compute.

* 5 pages, 1 figure, accepted to the 2024 European Robotics Forum

EfficientPPS: Part-aware Panoptic Segmentation of Transparent Objects for Robotic Manipulation

Dec 21, 2023

Benjamin Alt, Minh Dang Nguyen, Andreas Hermann, Darko Katic, Rainer Jäkel, Rüdiger Dillmann, Eric Sax

Abstract: The use of autonomous robots for assistance tasks in hospitals has the potential to free up qualified staff and improve patient care. However, the ubiquity of deformable and transparent objects in hospital settings poses significant challenges to vision-based perception systems. We present EfficientPPS, a neural architecture for part-aware panoptic segmentation that provides robots with semantically rich visual information for grasping and manipulation tasks. We also present an unsupervised data collection and labelling method to reduce the need for human involvement in the training process. EfficientPPS is evaluated on a dataset containing real-world hospital objects and demonstrated to be robust and efficient in grasping transparent transfusion bags with a collaborative robot arm.

* 8 pages, 8 figures, presented at the 56th International Symposium on Robotics (ISR Europe 2023)
