Abstract:This paper presents T3: Transferable Tactile Transformers, a framework for tactile representation learning that scales across multi-sensors and multi-tasks. T3 is designed to overcome the contemporary issue that camera-based tactile sensing is extremely heterogeneous, i.e. sensors are built into different form factors, and existing datasets were collected for disparate tasks. T3 captures the shared latent information across different sensor-task pairings by constructing a shared trunk transformer with sensor-specific encoders and task-specific decoders. The pre-training of T3 utilizes a novel Foundation Tactile (FoTa) dataset, which is aggregated from several open-sourced datasets and it contains over 3 million data points gathered from 13 sensors and 11 tasks. FoTa is the largest and most diverse dataset in tactile sensing to date and it is made publicly available in a unified format. Across various sensors and tasks, experiments show that T3 pre-trained with FoTa achieved zero-shot transferability in certain sensor-task pairings, can be further fine-tuned with small amounts of domain-specific data, and its performance scales with bigger network sizes. T3 is also effective as a tactile encoder for long horizon contact-rich manipulation. Results from sub-millimeter multi-pin electronics insertion tasks show that T3 achieved a task success rate 25% higher than that of policies trained with tactile encoders trained from scratch, or 53% higher than without tactile sensing. Data, code, and model checkpoints are open-sourced at https://t3.alanz.info.
Abstract:Compared to fully-actuated robotic end-effectors, underactuated ones are generally more adaptive, robust, and cost-effective. However, state estimation for underactuated hands is usually more challenging. Vision-based tactile sensors, like Gelsight, can mitigate this issue by providing high-resolution tactile sensing and accurate proprioceptive sensing. As such, we present GelLink, a compact, underactuated, linkage-driven robotic finger with low-cost, high-resolution vision-based tactile sensing and proprioceptive sensing capabilities. In order to reduce the amount of embedded hardware, i.e. the cameras and motors, we optimize the linkage transmission with a planar linkage mechanism simulator and develop a planar reflection simulator to simplify the tactile sensing hardware. As a result, GelLink only requires one motor to actuate the three phalanges, and one camera to capture tactile signals along the entire finger. Overall, GelLink is a compact robotic finger that shows adaptability and robustness when performing grasping tasks. The integration of vision-based tactile sensors can significantly enhance the capabilities of underactuated fingers and potentially broaden their future usage.
Abstract:Compliant grippers enable robots to work with humans in unstructured environments. In general, these grippers can improve with tactile sensing to estimate the state of objects around them to precisely manipulate objects. However, co-designing compliant structures with high-resolution tactile sensing is a challenging task. We propose a simulation framework for the end-to-end forward design of GelSight Fin Ray sensors. Our simulation framework consists of mechanical simulation using the finite element method (FEM) and optical simulation including physically based rendering (PBR). To simulate the fluorescent paint used in these GelSight Fin Rays, we propose an efficient method that can be directly integrated in PBR. Using the simulation framework, we investigate design choices available in the compliant grippers, namely gel pad shapes, illumination conditions, Fin Ray gripper sizes, and Fin Ray stiffness. This infrastructure enables faster design and prototype time frames of new Fin Ray sensors that have various sensing areas, ranging from 48 mm $\times$ \18 mm to 70 mm $\times$ 35 mm. Given the parameters we choose, we can thus optimize different Fin Ray designs and show their utility in grasping day-to-day objects.
Abstract:The synthesis of tactile sensing with compliance is essential to many fields, from agricultural usages like fruit picking, to sustainability practices such as sorting recycling, to the creation of safe home-care robots for the elderly to age with dignity. From tactile sensing, we can discern material properties, recognize textures, and determine softness, while with compliance, we are able to securely and safely interact with the objects and the environment around us. These two abilities can culminate into a useful soft robotic gripper, such as the original GelSight Fin Ray, which is able to grasp a large variety of different objects and also perform a simple household manipulation task: wine glass reorientation. Although the original GelSight Fin Ray solves the problem of interfacing a generally rigid, high-resolution sensor with a soft, compliant structure, we can improve the robustness of the sensor and implement techniques that make such camera-based tactile sensors applicable to a wider variety of soft robot designs. We first integrate flexible mirrors and incorporate the rigid electronic components into the base of the gripper, which greatly improves the compliance of the Fin Ray structure. Then, we synthesize a flexible and high-elongation silicone adhesive-based fluorescent paint, which can provide good quality 2D tactile localization results for our sensor. Finally, we incorporate all of these techniques into a new design: the Baby Fin Ray, which we use to dig through clutter, and perform successful classification of nuts in their shells. The supplementary video can be found here: https://youtu.be/_oD_QFtYTPM