Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kenji Shimada

Semantic Exploration and Dense Mapping of Complex Environments using Ground Robots Equipped with LiDAR and Panoramic Camera

May 28, 2025

Xiaoyang Zhan, Shixin Zhou, Qianqian Yang, Yixuan Zhao, Hao Liu, Srinivas Chowdary Ramineni, Kenji Shimada

Abstract:This paper presents a system for autonomous semantic exploration and dense semantic target mapping of a complex unknown environment using a ground robot equipped with a LiDAR-panoramic camera suite. Existing approaches often struggle to balance collecting high-quality observations from multiple view angles and avoiding unnecessary repetitive traversal. To fill this gap, we propose a complete system combining mapping and planning. We first redefine the task as completing both geometric coverage and semantic viewpoint observation. We then manage semantic and geometric viewpoints separately and propose a novel Priority-driven Decoupled Local Sampler to generate local viewpoint sets. This enables explicit multi-view semantic inspection and voxel coverage without unnecessary repetition. Building on this, we develop a hierarchical planner to ensure efficient global coverage. In addition, we propose a Safe Aggressive Exploration State Machine, which allows aggressive exploration behavior while ensuring the robot's safety. Our system includes a plug-and-play semantic target mapping module that integrates seamlessly with state-of-the-art SLAM algorithms for pointcloud-level dense semantic target mapping. We validate our approach through extensive experiments in both realistic simulations and complex real-world environments. Simulation results show that our planner achieves faster exploration and shorter travel distances while guaranteeing a specified number of multi-view inspections. Real-world experiments further confirm the system's effectiveness in achieving accurate dense semantic object mapping of unstructured environments.

Via

Access Paper or Ask Questions

Safe-Construct: Redefining Construction Safety Violation Recognition as 3D Multi-View Engagement Task

Apr 15, 2025

Aviral Chharia, Tianyu Ren, Tomotake Furuhata, Kenji Shimada

Abstract:Recognizing safety violations in construction environments is critical yet remains underexplored in computer vision. Existing models predominantly rely on 2D object detection, which fails to capture the complexities of real-world violations due to: (i) an oversimplified task formulation treating violation recognition merely as object detection, (ii) inadequate validation under realistic conditions, (iii) absence of standardized baselines, and (iv) limited scalability from the unavailability of synthetic dataset generators for diverse construction scenarios. To address these challenges, we introduce Safe-Construct, the first framework that reformulates violation recognition as a 3D multi-view engagement task, leveraging scene-level worker-object context and 3D spatial understanding. We also propose the Synthetic Indoor Construction Site Generator (SICSG) to create diverse, scalable training data, overcoming data limitations. Safe-Construct achieves a 7.6% improvement over state-of-the-art methods across four violation types. We rigorously evaluate our approach in near-realistic settings, incorporating four violations, four workers, 14 objects, and challenging conditions like occlusions (worker-object, worker-worker) and variable illumination (back-lighting, overexposure, sunlight). By integrating 3D multi-view spatial understanding and synthetic data generation, Safe-Construct sets a new benchmark for scalable and robust safety monitoring in high-risk industries. Project Website: https://Safe-Construct.github.io/Safe-Construct

* CVPR Workshop 2025; Project Website: https://Safe-Construct.github.io/Safe-Construct

Via

Access Paper or Ask Questions

Adaptive Planning Framework for UAV-Based Surface Inspection in Partially Unknown Indoor Environments

Apr 12, 2025

Hanyu Jin, Zhefan Xu, Haoyu Shen, Xinming Han, Kanlong Ye, Kenji Shimada

Abstract:Inspecting indoor environments such as tunnels, industrial facilities, and construction sites is essential for infrastructure monitoring and maintenance. While manual inspection in these environments is often time-consuming and potentially hazardous, Unmanned Aerial Vehicles (UAVs) can improve efficiency by autonomously handling inspection tasks. Such inspection tasks usually rely on reference maps for coverage planning. However, in industrial applications, only the floor plans are typically available. The unforeseen obstacles not included in the floor plans will result in outdated reference maps and inefficient or unsafe inspection trajectories. In this work, we propose an adaptive inspection framework that integrates global coverage planning with local reactive adaptation to improve the coverage and efficiency of UAV-based inspection in partially unknown indoor environments. Experimental results in structured indoor scenarios demonstrate the effectiveness of the proposed approach in inspection efficiency and achieving high coverage rates with adaptive obstacle handling, highlighting its potential for enhancing the efficiency of indoor facility inspection.

Via

Access Paper or Ask Questions

LV-DOT: LiDAR-visual dynamic obstacle detection and tracking for autonomous robot navigation

Feb 28, 2025

Zhefan Xu, Haoyu Shen, Xinming Han, Hanyu Jin, Kanlong Ye, Kenji Shimada

Abstract:Accurate perception of dynamic obstacles is essential for autonomous robot navigation in indoor environments. Although sophisticated 3D object detection and tracking methods have been investigated and developed thoroughly in the fields of computer vision and autonomous driving, their demands on expensive and high-accuracy sensor setups and substantial computational resources from large neural networks make them unsuitable for indoor robotics. Recently, more lightweight perception algorithms leveraging onboard cameras or LiDAR sensors have emerged as promising alternatives. However, relying on a single sensor poses significant limitations: cameras have limited fields of view and can suffer from high noise, whereas LiDAR sensors operate at lower frequencies and lack the richness of visual features. To address this limitation, we propose a dynamic obstacle detection and tracking framework that uses both onboard camera and LiDAR data to enable lightweight and accurate perception. Our proposed method expands on our previous ensemble detection approach, which integrates outputs from multiple low-accuracy but computationally efficient detectors to ensure real-time performance on the onboard computer. In this work, we propose a more robust fusion strategy that integrates both LiDAR and visual data to enhance detection accuracy further. We then utilize a tracking module that adopts feature-based object association and the Kalman filter to track and estimate detected obstacles' states. Besides, a dynamic obstacle classification algorithm is designed to robustly identify moving objects. The dataset evaluation demonstrates a better perception performance compared to benchmark methods. The physical experiments on a quadcopter robot confirms the feasibility for real-world navigation.

* 8 pages, 7 figures, 2 tables

Via

Access Paper or Ask Questions

Heuristic-based Incremental Probabilistic Roadmap for Efficient UAV Exploration in Dynamic Environments

Sep 17, 2023

Zhefan Xu, Christopher Suzuki, Xiaoyang Zhan, Kenji Shimada

Abstract:Autonomous exploration in dynamic environments necessitates a planner that can proactively respond to changes and make efficient and safe decisions for robots. Although plenty of sampling-based works have shown success in exploring static environments, their inherent sampling randomness and limited utilization of previous samples often result in sub-optimal exploration efficiency. Additionally, most of these methods struggle with efficient replanning and collision avoidance in dynamic settings. To overcome these limitations, we propose the Heuristic-based Incremental Probabilistic Roadmap Exploration (HIRE) planner for UAVs exploring dynamic environments. The proposed planner adopts an incremental sampling strategy based on the probabilistic roadmap constructed by heuristic sampling toward the unexplored region next to the free space, defined as the heuristic frontier regions. The heuristic frontier regions are detected by applying a lightweight vision-based method to the different levels of the occupancy map. Moreover, our dynamic module ensures that the planner dynamically updates roadmap information based on the environment changes and avoids dynamic obstacles. Simulation and physical experiments prove that our planner can efficiently and safely explore dynamic environments.

Via

Access Paper or Ask Questions

Quadcopter Trajectory Time Minimization and Robust Collision Avoidance via Optimal Time Allocation

Sep 15, 2023

Zhefan Xu, Kenji Shimada

Figure 1 for Quadcopter Trajectory Time Minimization and Robust Collision Avoidance via Optimal Time Allocation

Figure 2 for Quadcopter Trajectory Time Minimization and Robust Collision Avoidance via Optimal Time Allocation

Figure 3 for Quadcopter Trajectory Time Minimization and Robust Collision Avoidance via Optimal Time Allocation

Figure 4 for Quadcopter Trajectory Time Minimization and Robust Collision Avoidance via Optimal Time Allocation

Abstract:Autonomous navigation requires robots to generate trajectories for collision avoidance efficiently. Although plenty of previous works have proven successful in generating smooth and spatially collision-free trajectories, their solutions often suffer from suboptimal time efficiency and potential unsafety, particularly when accounting for uncertainties in robot perception and control. To address this issue, this paper presents the Robust Optimal Time Allocation (ROTA) framework. This framework is designed to optimize the time progress of the trajectories temporally, serving as a post-processing tool to enhance trajectory time efficiency and safety under uncertainties. In this study, we begin by formulating a non-convex optimization problem aimed at minimizing trajectory execution time while incorporating constraints on collision probability as the robot approaches obstacles. Subsequently, we introduce the concept of the trajectory braking zone and adopt the chance-constrained formulation for robust collision avoidance in the braking zones. Finally, the non-convex optimization problem is reformulated into a second-order cone programming problem to achieve real-time performance. Through simulations and physical flight experiments, we demonstrate that the proposed approach effectively reduces trajectory execution time while enabling robust collision avoidance in complex environments.

Via

Access Paper or Ask Questions

Improving Deep Learning-based Defect Detection on Window Frames with Image Processing Strategies

Sep 13, 2023

Jorge Vasquez, Hemant K. Sharma, Tomotake Furuhata, Kenji Shimada

Abstract:Detecting subtle defects in window frames, including dents and scratches, is vital for upholding product integrity and sustaining a positive brand perception. Conventional machine vision systems often struggle to identify these defects in challenging environments like construction sites. In contrast, modern vision systems leveraging machine and deep learning (DL) are emerging as potent tools, particularly for cosmetic inspections. However, the promise of DL is yet to be fully realized. A few manufacturers have established a clear strategy for AI integration in quality inspection, hindered mainly by issues like scarce clean datasets and environmental changes that compromise model accuracy. Addressing these challenges, our study presents an innovative approach that amplifies defect detection in DL models, even with constrained data resources. The paper proposes a new defect detection pipeline called InspectNet (IPT-enhanced UNET) that includes the best combination of image enhancement and augmentation techniques for pre-processing the dataset and a Unet model tuned for window frame defect detection and segmentation. Experiments were carried out using a Spot Robot doing window frame inspections . 16 variations of the dataset were constructed using different image augmentation settings. Results of the experiments revealed that, on average, across all proposed evaluation measures, Unet outperformed all other algorithms when IPT-enhanced augmentations were applied. In particular, when using the best dataset, the average Intersection over Union (IoU) values achieved were IPT-enhanced Unet, reaching 0.91 of mIoU.

Via

Access Paper or Ask Questions

Onboard dynamic-object detection and tracking for autonomous robot navigation with RGB-D camera

Feb 28, 2023

Zhefan Xu, Xiaoyang Zhan, Yumeng Xiu, Christopher Suzuki, Kenji Shimada

Figure 1 for Onboard dynamic-object detection and tracking for autonomous robot navigation with RGB-D camera

Figure 2 for Onboard dynamic-object detection and tracking for autonomous robot navigation with RGB-D camera

Figure 3 for Onboard dynamic-object detection and tracking for autonomous robot navigation with RGB-D camera

Figure 4 for Onboard dynamic-object detection and tracking for autonomous robot navigation with RGB-D camera

Abstract:Deploying autonomous robots in crowded indoor environments usually requires them to have accurate dynamic obstacle perception. Although plenty of previous works in the autonomous driving field have investigated the 3D object detection problem, the usage of dense point clouds from a heavy LiDAR and their high computation cost for learning-based data processing make those methods not applicable to lightweight robots, such as vision-based UAVs with small onboard computers. To address this issue, we propose a lightweight 3D dynamic obstacle detection and tracking (DODT) method based on an RGB-D camera. Our method adopts a novel ensemble detection strategy, combining multiple computationally efficient but low-accuracy detectors to achieve real-time high-accuracy obstacle detection. Besides, we introduce a new feature-based data association method to prevent mismatches and use the Kalman filter with the constant acceleration model to track detected obstacles. In addition, our system includes an optional and auxiliary learning-based module to enhance the obstacle detection range and dynamic obstacle identification. The users can determine whether or not to run this module based on the available computation resources. The proposed method is implemented in a lightweight quadcopter, and the experiments prove that the algorithm can make the robot detect dynamic obstacles and navigate dynamic environments safely.

* 8 pages, 12 figures, 2 tables

Via

Access Paper or Ask Questions

A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles

Jan 20, 2023

Zhefan Xu, Baihan Chen, Xiaoyang Zhan, Yumeng Xiu, Christopher Suzuki, Kenji Shimada

Abstract:Tunnel construction using the drill-and-blast method requires the 3D measurement of the excavation front to evaluate underbreak locations. Considering the inspection and measurement task's safety, cost, and efficiency, deploying lightweight autonomous robots, such as unmanned aerial vehicles (UAV), becomes more necessary and popular. Most of the previous works use a prior map for inspection viewpoint determination and do not consider dynamic obstacles. To maximally increase the level of autonomy, this paper proposes a vision-based UAV inspection framework for dynamic tunnel environments without using a prior map. Our approach utilizes a hierarchical planning scheme, decomposing the inspection problem into different levels. The high-level decision maker first determines the task for the robot and generates the target point. Then, the mid-level path planner finds the waypoint path and optimizes the collision-free static trajectory. Finally, the static trajectory will be fed into the low-level local planner to avoid dynamic obstacles and navigate to the target point. Besides, our framework contains a novel dynamic map module that can simultaneously track dynamic obstacles and represent static obstacles based on an RGB-D camera. After inspection, the Structure-from-Motion (SfM) pipeline is applied to generate the 3D shape of the target. To our best knowledge, this is the first time autonomous inspection has been realized in unknown and dynamic tunnel environments. Our flight experiments in a real tunnel prove that our method can autonomously inspect the tunnel excavation front surface.

* 8 pages, 8 figures

Via

Access Paper or Ask Questions

Component Segmentation of Engineering Drawings Using Graph Convolutional Networks

Dec 01, 2022

Wentai Zhang, Joe Joseph, Yue Yin, Liuyue Xie, Tomotake Furuhata, Soji Yamakawa, Kenji Shimada, Levent Burak Kara

Figure 1 for Component Segmentation of Engineering Drawings Using Graph Convolutional Networks

Figure 2 for Component Segmentation of Engineering Drawings Using Graph Convolutional Networks

Figure 3 for Component Segmentation of Engineering Drawings Using Graph Convolutional Networks

Figure 4 for Component Segmentation of Engineering Drawings Using Graph Convolutional Networks

Abstract:We present a data-driven framework to automate the vectorization and machine interpretation of 2D engineering part drawings. In industrial settings, most manufacturing engineers still rely on manual reads to identify the topological and manufacturing requirements from drawings submitted by designers. The interpretation process is laborious and time-consuming, which severely inhibits the efficiency of part quotation and manufacturing tasks. While recent advances in image-based computer vision methods have demonstrated great potential in interpreting natural images through semantic segmentation approaches, the application of such methods in parsing engineering technical drawings into semantically accurate components remains a significant challenge. The severe pixel sparsity in engineering drawings also restricts the effective featurization of image-based data-driven methods. To overcome these challenges, we propose a deep learning based framework that predicts the semantic type of each vectorized component. Taking a raster image as input, we vectorize all components through thinning, stroke tracing, and cubic bezier fitting. Then a graph of such components is generated based on the connectivity between the components. Finally, a graph convolutional neural network is trained on this graph data to identify the semantic type of each component. We test our framework in the context of semantic segmentation of text, dimension and, contour components in engineering drawings. Results show that our method yields the best performance compared to recent image, and graph-based segmentation methods.

* Preprint submitted to Computers in Industry

Via

Access Paper or Ask Questions