Kostas Bekris

Fully Packed and Ready to Go: High-Density, Rearrangement-Free, Grid-Based Storage and Retrieval

May 28, 2025

Tzvika Geft, Kostas Bekris, Jingjin Yu

Abstract: Grid-based storage systems with uniformly shaped loads (e.g., containers, pallets, totes) are commonplace in logistics, industrial, and transportation domains. A key performance metric for such systems is the maximization of space utilization, which requires some loads to be placed behind or below others, preventing direct access to them. Consequently, dense storage settings bring up the challenge of determining how to place loads while minimizing costly rearrangement efforts necessary during retrieval. This paper considers the setting involving an inbound phase, during which loads arrive, followed by an outbound phase, during which loads depart. The setting is prevalent in distribution centers, automated parking garages, and container ports. In both phases, minimizing the number of rearrangement actions results in more efficient (e.g., faster and more energy-efficient) operations. In contrast to previous work focusing on stack-based systems, this effort examines the case where loads can be freely moved along the grid, e.g., by a mobile robot, expanding the range of possible motions. We establish that for a range of scenarios, such as having limited prior knowledge of the loads' arrival sequences or grids with a narrow opening, a (best possible) rearrangement-free solution always exists, including when the loads fill the grid to its capacity. In particular, when the sequences are fully known, we establish an intriguing characterization showing that rearrangement can always be avoided if and only if the open side of the grid (used to access the storage) is at least 3 cells wide. We further discuss useful practical implications of our solutions.

Developing Modular Grasping and Manipulation Pipeline Infrastructure to Streamline Performance Benchmarking

Apr 09, 2025

Brian Flynn, Kostas Bekris, Berk Calli, Aaron Dollar, Adam Norton, Yu Sun, Holly Yanco

Abstract: The robot manipulation ecosystem currently faces issues with integrating open-source components and reproducing results. This limits the ability of the community to benchmark and compare the performance of different solutions to one another in an effective manner, instead relying on largely holistic evaluations. As part of the COMPARE Ecosystem project, we are developing modular grasping and manipulation pipeline infrastructure in order to streamline performance benchmarking. The infrastructure will be used towards the establishment of standards and guidelines for modularity and improved open-source development and benchmarking. This paper provides a high-level overview of the architecture of the pipeline infrastructure, experiments conducted to exercise it during development, and future work to expand its modularity.

* IEEE International Conference on Robotics and Automation (ICRA) 2025, Workshop on Robot Software Architectures (RSA25), Atlanta, Georgia, USA, May 2025

Impact-resistant, autonomous robots inspired by tensegrity architecture

Jan 25, 2025

William R. Johnson III, Xiaonan Huang, Shiyang Lu, Kun Wang, Joran W. Booth, Kostas Bekris, Rebecca Kramer-Bottiglio

Abstract: Future robots will navigate perilous, remote environments with resilience and autonomy. Researchers have proposed building robots with compliant bodies to enhance robustness, but this approach often sacrifices the autonomous capabilities expected of rigid robots. Inspired by tensegrity architecture, we introduce a tensegrity robot -- a hybrid robot made from rigid struts and elastic tendons -- that demonstrates the advantages of compliance and the autonomy necessary for task performance. This robot boasts impact resistance and autonomy in a field environment and additional advances in the state of the art, including surviving harsh impacts from drops (at least 5.7 m), accurately reconstructing its shape and orientation using on-board sensors, achieving high locomotion speeds (18 bar lengths per minute), and climbing the steepest incline of any tensegrity robot (28 degrees). We characterize the robot's locomotion on unstructured terrain, showcase its autonomous capabilities in navigation tasks, and demonstrate its robustness by rolling it off a cliff.

Learning Differentiable Tensegrity Dynamics using Graph Neural Networks

Oct 16, 2024

Nelson Chen, Kun Wang, William R. Johnson III, Rebecca Kramer-Bottiglio, Kostas Bekris, Mridul Aanjaneya

Abstract: Tensegrity robots are composed of rigid struts and flexible cables. They constitute an emerging class of hybrid rigid-soft robotic systems and are promising systems for a wide array of applications, ranging from locomotion to assembly. They are difficult to control and model accurately, however, due to their compliance and high number of degrees of freedom. To address this issue, prior work has introduced a differentiable physics engine designed for tensegrity robots based on first principles. In contrast, this work proposes the use of graph neural networks to model contact dynamics over a graph representation of tensegrity robots, which leverages their natural graph-like cable connectivity between end caps of rigid rods. This learned simulator can accurately model 3-bar and 6-bar tensegrity robot dynamics in simulation-to-simulation experiments where MuJoCo is used as the ground truth. It can also achieve higher accuracy than the previous differentiable engine for a real 3-bar tensegrity robot, for which the robot state is only partially observable. When compared against direct applications of recent mesh-based graph neural network simulators, the proposed approach is computationally more efficient, both for training and inference, while achieving higher accuracy. Code and data are available at https://github.com/nchen9191/tensegrity_gnn_simulator_public

OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D Data

Nov 06, 2023

Shiyang Lu, Haonan Chang, Eric Pu Jing, Abdeslam Boularias, Kostas Bekris

Abstract: This work presents OVIR-3D, a straightforward yet effective method for open-vocabulary 3D object instance retrieval without using any 3D data for training. Given a language query, the proposed method returns a ranked set of 3D object instance segments based on the feature similarity of the instance and the text query. This is achieved by a multi-view fusion of text-aligned 2D region proposals into 3D space, where the 2D region proposal network can leverage 2D datasets, which are more accessible and typically larger than 3D datasets. The proposed fusion process is efficient, as it can be performed in real time for most indoor 3D scenes and does not require additional training in 3D space. Experiments on public datasets and a real robot show the effectiveness of the method and its potential for applications in robot navigation and manipulation.
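
The retrieval step described in the abstract, ranking instances by how similar their features are to the query's, can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, the three-dimensional toy vectors, and the names `cosine_similarity` and `rank_instances` are all assumptions made for illustration. In the actual method the features would come from text-aligned 2D region proposals fused into 3D.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_instances(query_feature, instance_features):
    """Return (instance_name, score) pairs sorted from most to least similar.

    `instance_features` maps a hypothetical instance name to its fused feature.
    """
    scored = [
        (name, cosine_similarity(query_feature, feature))
        for name, feature in instance_features.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed): one feature vector per 3D instance segment.
instances = {
    "mug": [0.9, 0.1, 0.0],
    "sofa": [0.0, 1.0, 0.0],
    "lamp": [0.5, 0.5, 0.7],
}
query = [1.0, 0.0, 0.0]  # stand-in for the embedded text query

ranking = rank_instances(query, instances)
ranked_names = [name for name, _ in ranking]
```

With this toy data, the instance whose feature points most nearly in the query's direction is ranked first.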

Socially Cognizant Robotics for a Technology Enhanced Society

Oct 27, 2023

Kristin J. Dana, Clinton Andrews, Kostas Bekris, Jacob Feldman, Matthew Stone, Pernille Hemmer, Aaron Mazzeo, Hal Salzman, Jingang Yi

Abstract: Emerging applications of robotics, and concerns about their impact, require the research community to put human-centric objectives front-and-center. To meet this challenge, we advocate an interdisciplinary approach, socially cognizant robotics, which synthesizes technical and social science methods. We argue that this approach follows from the need to empower stakeholder participation (from synchronous human feedback to asynchronous societal assessment) in shaping AI-driven robot behavior at all levels, and leads to a range of novel research perspectives and problems both for improving robots' interactions with individuals and impacts on society. Drawing on these arguments, we develop best practices for socially cognizant robot design that balance traditional technology-based metrics (e.g., efficiency, precision, and accuracy) with critically important, albeit challenging to measure, human and society-based metrics.

Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs

Sep 27, 2023

Haonan Chang, Kowndinya Boyalakuntla, Shiyang Lu, Siwei Cai, Eric Jing, Shreesh Keskar, Shijie Geng, Adeeb Abbas, Lifeng Zhou, Kostas Bekris (+1 more)

Abstract: We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as "pick up a cup on a kitchen table" or "navigate to a sofa on which someone is sitting". In contrast to existing research on 3D scene graphs, OVSG supports free-form text input and open-vocabulary querying. Through a series of comparative experiments using the ScanNet dataset and a self-collected dataset, we demonstrate that our proposed approach significantly surpasses the performance of previous semantic-based localization techniques. Moreover, we highlight the practical application of OVSG in real-world robot navigation and manipulation experiments.

* The code and dataset used for evaluation can be found at https://github.com/changhaonan/OVSG. This paper has been accepted at CoRL 2023

Pick Planning Strategies for Large-Scale Package Manipulation

Sep 23, 2023

Shuai Li, Azarakhsh Keipour, Kevin Jamieson, Nicolas Hudson, Sicong Szhao, Charles Swan, Kostas Bekris

Abstract: Automating warehouse operations can reduce logistics overhead costs, ultimately driving down the final price for consumers, increasing the speed of delivery, and enhancing resilience to market fluctuations. This extended abstract showcases large-scale package manipulation from unstructured piles in Amazon Robotics' Robot Induction (Robin) fleet, which is used for picking and singulating up to 6 million packages per day and so far has manipulated over 2 billion packages. It describes the various heuristic methods developed over time and their successor, which utilizes a pick success predictor trained on real production data. To the best of the authors' knowledge, this work is the first large-scale deployment of learned pick quality estimation methods in a real production system.

* 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Learning Meets Model-based Methods for Manipulation and Grasping Workshop. arXiv admin note: substantial text overlap with arXiv:2305.10272

Large-Scale Package Manipulation via Learned Metrics of Pick Success

May 17, 2023

Shuai Li, Azarakhsh Keipour, Kevin Jamieson, Nicolas Hudson, Charles Swan, Kostas Bekris

Abstract: Automating warehouse operations can reduce logistics overhead costs, ultimately driving down the final price for consumers, increasing the speed of delivery, and enhancing resilience to workforce fluctuations. The past few years have seen increased interest in automating such repeated tasks but mostly in controlled settings. Tasks such as picking objects from unstructured, cluttered piles have only recently become robust enough for large-scale deployment with minimal human intervention. This paper demonstrates large-scale package manipulation from unstructured piles in Amazon Robotics' Robot Induction (Robin) fleet, which utilizes a pick success predictor trained on real production data. Specifically, the system was trained on over 394K picks. It is used for singulating up to 5 million packages per day and has manipulated over 200 million packages during this paper's evaluation period. The developed learned pick quality measure ranks various pick alternatives in real time and prioritizes the most promising ones for execution. The pick success predictor aims to estimate from prior experience the success probability of a desired pick by the deployed industrial robotic arms in cluttered scenes containing deformable and rigid objects with partially known properties. It is a shallow machine learning model, which allows us to evaluate which features are most important for the prediction. An online pick ranker leverages the learned success predictor to prioritize the most promising picks for the robotic arm, which are then assessed for collision avoidance. This learned ranking process is demonstrated to overcome the limitations of manually engineered and heuristic alternatives and to outperform them. To the best of the authors' knowledge, this paper presents the first large-scale deployment of learned pick quality estimation methods in a real production system.
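
The rank-then-check flow described in the abstract (score candidate picks with a learned success predictor, prioritize the most promising, then assess them for collisions) might be sketched as follows. This is an illustrative sketch only, not the deployed system: the logistic model stands in for the unspecified "shallow machine learning model", and the feature values, weights, the `collision_free` flag, and the names `PickCandidate`, `predict_success`, and `select_pick` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class PickCandidate:
    name: str
    features: list          # hypothetical pick features (e.g., surface scores)
    collision_free: bool    # stand-in for the result of a real collision check


def predict_success(features, weights, bias):
    """Logistic model used here as an assumed stand-in for the learned predictor."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def select_pick(candidates, weights, bias):
    """Rank candidates by predicted success, then return the best collision-free one."""
    ranked = sorted(
        candidates,
        key=lambda c: predict_success(c.features, weights, bias),
        reverse=True,
    )
    for candidate in ranked:
        if candidate.collision_free:
            return candidate
    return None  # no feasible pick among the candidates


# Toy weights and candidates (assumed values, not from the paper).
weights, bias = [2.0, -1.0], 0.0
candidates = [
    PickCandidate("A", [1.0, 0.0], collision_free=False),  # highest score, infeasible
    PickCandidate("B", [0.5, 0.0], collision_free=True),
    PickCandidate("C", [0.0, 1.0], collision_free=True),
]

chosen = select_pick(candidates, weights, bias)
```

In this toy case the top-scoring candidate fails the collision check, so the next-ranked feasible candidate is selected. Checking collisions only after ranking mirrors the order given in the abstract.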

* Accepted at Robotics: Science and Systems (RSS 2023) conference, July 10 - 14, 2023, Daegu, Republic of Korea

Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos

Apr 09, 2023

Shiyang Lu, Yunfu Deng, Abdeslam Boularias, Kostas Bekris

Abstract: This work proposes a self-supervised learning system for segmenting rigid objects in RGB images. The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot. A key feature of the self-supervised training process is a graph-matching algorithm that operates on the over-segmentation output of the point cloud that is reconstructed from each video. The graph matching, along with point cloud registration, is able to find reoccurring object patterns across videos and combine them into 3D object pseudo labels, even under occlusions or different viewing angles. Projected 2D object masks from 3D pseudo labels are used to train a pixel-wise feature extractor through contrastive learning. During online inference, a clustering method uses the learned features to cluster foreground pixels into object segments. Experiments highlight the method's effectiveness on both real and synthetic video datasets, which include cluttered scenes of tabletop objects. The proposed method outperforms existing unsupervised methods for object segmentation by a large margin.
