Abstract:Surgical scene understanding in Robot-assisted Minimally Invasive Surgery (RMIS) is highly reliant on visual cues and lacks tactile perception. Force-modulated surgical palpation with tactile feedback is necessary for localization, geometry/depth estimation, and dexterous exploration of abnormal stiff inclusions in subsurface tissue layers. Prior works explored surface-level tissue abnormalities or single layered tissue-tumor embeddings with more than 300 palpations for dense 2D stiffness mapping. Our approach focuses on 3D reconstructions of sub-dermal tumor surface profiles in multi-layered tissue (skin-fat-muscle) using a visually-guided novel tactile navigation policy. A robotic palpation probe with tri-axial force sensing was leveraged for tactile exploration of the phantom. From a surface mesh of the surgical region initialized from a depth camera, the policy explores a surgeon's region of interest through palpation, sampled from bayesian optimization. Each palpation includes contour following using a contact-safe impedance controller to trace the sub-dermal tumor geometry, until the underlying tumor-tissue boundary is reached. Projections of these contour following palpation trajectories allows 3D reconstruction of the subdermal tumor surface profile in less than 100 palpations. Our approach generates high-fidelity 3D surface reconstructions of rigid tumor embeddings in tissue layers with isotropic elasticities, although soft tumor geometries are yet to be explored.
Abstract:UniT is a novel approach to tactile representation learning, using VQVAE to learn a compact latent space and serve as the tactile representation. It uses tactile images obtained from a single simple object to train the representation with transferability and generalizability. This tactile representation can be zero-shot transferred to various downstream tasks, including perception tasks and manipulation policy learning. Our benchmarking on an in-hand 3D pose estimation task shows that UniT outperforms existing visual and tactile representation learning methods. Additionally, UniT's effectiveness in policy learning is demonstrated across three real-world tasks involving diverse manipulated objects and complex robot-object-environment interactions. Through extensive experimentation, UniT is shown to be a simple-to-train, plug-and-play, yet widely effective method for tactile representation learning. For more details, please refer to our open-source repository https://github.com/ZhengtongXu/UniT and the project website https://zhengtongxu.github.io/unifiedtactile.github.io/.