Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yasuhiro Yoshimura

GSplatVNM: Point-of-View Synthesis for Visual Navigation Models Using Gaussian Splatting

Mar 07, 2025

Kohei Honda, Takeshi Ishita, Yasuhiro Yoshimura, Ryo Yonitani

Abstract:This paper presents a novel approach to image-goal navigation by integrating 3D Gaussian Splatting (3DGS) with Visual Navigation Models (VNMs), a method we refer to as GSplatVNM. VNMs offer a promising paradigm for image-goal navigation by guiding a robot through a sequence of point-of-view images without requiring metrical localization or environment-specific training. However, constructing a dense and traversable sequence of target viewpoints from start to goal remains a central challenge, particularly when the available image database is sparse. To address these challenges, we propose a 3DGS-based viewpoint synthesis framework for VNMs that synthesizes intermediate viewpoints to seamlessly bridge gaps in sparse data while significantly reducing storage overhead. Experimental results in a photorealistic simulator demonstrate that our approach not only enhances navigation efficiency but also exhibits robustness under varying levels of image database sparsity.

* 8 pages, 4 figures

Via

Access Paper or Ask Questions

Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems

Sep 30, 2024

Matthew Ishige, Yasuhiro Yoshimura, Ryo Yonetani

Abstract:This paper presents opt-in camera, a concept of privacy-preserving camera systems capable of recording only specific individuals in a crowd who explicitly consent to be recorded. Our system utilizes a mobile wireless communication tag attached to personal belongings as proof of opt-in and as a means of localizing tag carriers in video footage. Specifically, the on-ground positions of the wireless tag are first tracked over time using the unscented Kalman filter (UKF). The tag trajectory is then matched against visual tracking results for pedestrians found in videos to identify the tag carrier. Technically, we devise a dedicated trajectory matching technique based on constrained linear optimization, as well as a novel calibration technique that handles wireless tag-camera calibration and hyperparameter tuning for the UKF, which mitigates the non-line-of-sight (NLoS) issue in wireless localization. We realize the proposed opt-in camera system using ultra-wideband (UWB) devices and an off-the-shelf webcam installed in the environment. Experimental results demonstrate that our system can perform opt-in recording of individuals in near real-time at 10 fps, with reliable identification accuracy for a crowd of 8-23 people in a confined space.

* 7 pages, 6 figures, submitted to international conference on robotics and automation (ICRA) 2025

Via

Access Paper or Ask Questions