Abstract: The integration of tactile sensing into compliant soft robotic grippers offers a compelling pathway toward advanced robotic grasping and safer human-robot interaction. Visual-tactile sensors (VTS) realize high-resolution, large-area tactile perception with affordable cameras. However, conventional visual-tactile sensors rely heavily on rigid forms, sacrificing finger compliance and sensing area to achieve localized tactile feedback. Enabling seamless, large-area tactile sensing in soft grippers remains challenging, as the deformations inherent to soft structures can obstruct the optical path and restrict the camera's field of view. To address these challenges, we present Gelsight FlexiRay, a multimodal visual-tactile sensor that integrates with Finray Effect (FRE) grippers to support safe and compliant interaction under substantial structural deformation. First, we adopt a multi-mirror configuration, which is systematically modeled and optimized based on the physical force-deformation characteristics of FRE grippers. Second, we equip Gelsight FlexiRay with human-like multimodal perception, including contact force and location, proprioception, temperature, texture, and slippage. Experiments demonstrate Gelsight FlexiRay's robust tactile performance across diverse deformation states, achieving a force measurement accuracy of 0.14 N and a proprioceptive positioning accuracy of 0.19 mm. Compared with state-of-the-art compliant VTS, the FlexiRay demonstrates 5 times larger structural deformation under the same loads. Its expanded sensing area and its ability to distinguish contact information and execute grasping and classification tasks highlight its potential for versatile, large-area multimodal tactile sensing within soft robotic systems. This work establishes a foundation for flexible, high-resolution tactile sensing in compliant robotic applications.
Abstract: In debating, rebuttal is one of the most critical stages, in which a speaker addresses the arguments presented by the opposing side. During this process, the speaker synthesizes their own persuasive articulation given the context from the opposing side. This work proposes a novel zero-shot text-to-speech synthesis system for rebuttal, namely Debatts. Debatts takes two speech prompts, one from the opposing side (i.e., the opponent) and one from the speaker. The prompt from the opponent provides the debating-style prosody, and the prompt from the speaker provides identity information. In particular, we pretrain the Debatts system on an in-the-wild dataset and integrate an additional reference encoder that takes the debating prompt for style. In addition, we create a debating dataset to develop Debatts. In this setting, Debatts can generate debating-style speech in rebuttal for any voice. Experimental results confirm the effectiveness of the proposed system in comparison with classic zero-shot TTS systems.
Abstract: Nowadays, large-scale text-to-speech (TTS) systems are primarily divided into two types: autoregressive and non-autoregressive. The autoregressive systems have certain deficiencies in robustness and cannot control speech duration. In contrast, non-autoregressive systems require explicit prediction of phone-level duration, which may compromise their naturalness. We introduce the Masked Generative Codec Transformer (MaskGCT), a fully non-autoregressive model for TTS that does not require precise alignment information between text and speech. MaskGCT is a two-stage model: in the first stage, the model uses text to predict semantic tokens extracted from a speech self-supervised learning (SSL) model, and in the second stage, the model predicts acoustic tokens conditioned on these semantic tokens. MaskGCT follows the mask-and-predict learning paradigm. During training, MaskGCT learns to predict masked semantic or acoustic tokens based on given conditions and prompts. During inference, the model generates tokens of a specified length in a parallel manner. We scale MaskGCT to a large-scale multilingual dataset with 100K hours of in-the-wild speech. Our experiments demonstrate that MaskGCT achieves superior or competitive performance compared to state-of-the-art zero-shot TTS systems in terms of quality, similarity, and intelligibility while offering higher generation efficiency than diffusion-based or autoregressive TTS models. Audio samples are available at https://maskgct.github.io.
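The mask-and-predict inference described above can be illustrated with a minimal sketch. It is not MaskGCT's implementation: `toy_predictor` is a hypothetical stand-in for the trained model, and the cosine unmasking schedule is an assumption borrowed from masked generative decoders in general, since the abstract does not state the schedule. The sketch only shows how a sequence of a specified length can be filled in over a few parallel refinement rounds.

```python
import math
import random

MASK = -1  # placeholder id for a token that has not been generated yet


def toy_predictor(tokens, vocab_size, rng):
    """Hypothetical stand-in for the model: one (token, confidence) guess per position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def mask_and_predict(length, vocab_size, steps=4, seed=0):
    """Fill `length` masked positions over at most `steps` parallel rounds."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(1, steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        guesses = toy_predictor(tokens, vocab_size, rng)
        # Assumed cosine schedule: how many positions stay masked after this round.
        keep_masked = int(length * math.cos(math.pi / 2 * step / steps))
        n_commit = max(1, len(masked) - keep_masked)
        # Commit the most confident guesses among the still-masked positions.
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        for i in masked[:n_commit]:
            tokens[i] = guesses[i][0]
    return tokens


if __name__ == "__main__":
    print(mask_and_predict(length=16, vocab_size=100))
```

Because the schedule reaches zero at the last round, every position is committed after `steps` rounds, independent of the sequence length.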
Abstract: This study investigates the stiffness characteristics of the Sprint Z3 head, also known as the 3-PRS parallel kinematics machine, which is among the most extensively researched and practically successful manipulators for precision machining applications. Despite the wealth of research on these robotic manipulators, no previous work has demonstrated their stiffness performance within the parasitic motion space. Such undesired motion influences their stiffness properties, as stiffness is configuration-dependent. Addressing this gap, this paper develops a stiffness model that accounts for both the velocity-level parasitic motion space and the regular workspace. Numerical simulations are provided to illustrate the stiffness characteristics of the manipulator across all considered spaces. The results indicate that the stiffness profile within the parasitic motion space is shallower, and its values are smaller, than the stiffness distribution across the orientation workspace. This implies that adequately evaluating a manipulator's performance requires assessing its ability to resist external loads during parasitic motion. Therefore, understanding this aspect is crucial for redesigning components to enhance overall stiffness.
Abstract: Recent research on mobile robots has focused on increasing their adaptability to unpredictable and unstructured environments using soft materials and structures. However, the key design parameters and control strategies of these compliant robots are predominantly determined through experimental iteration and lack a solid theoretical foundation. To improve their efficiency, this paper provides mathematical models of two locomotion modes, crawling and swimming. Specifically, a dynamic model is first devised to reveal the influence of the contact surfaces' frictional coefficients on displacements in different motion phases. In addition, a swimming kinematics model is derived using coordinate transformation, based on which we develop an algorithm that systematically plans human-like swimming gaits that maximize thrust. The proposed algorithm is highly generalizable and has the potential to be applied to other soft robots with multiple joints. Simulation experiments have been conducted to illustrate the effectiveness of the proposed modeling.
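As a rough illustration of what a kinematics model built on coordinate transformation involves, the sketch below chains planar homogeneous transforms along a multi-joint limb to obtain each link end in the base frame. It is a generic forward-kinematics example under assumed conventions (planar serial chain, relative joint angles), not the paper's swimming model; `transform`, `matmul`, and `joint_positions` are hypothetical names.

```python
import math


def transform(theta, length):
    """3x3 homogeneous transform: rotate by theta, then advance `length` along the new x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, length * c],
            [s, c, length * s],
            [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def joint_positions(joint_angles, link_lengths):
    """Base-frame (x, y) of the end of every link in a planar serial chain."""
    pose = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    points = []
    for theta, length in zip(joint_angles, link_lengths):
        # Each joint angle is measured relative to the previous link.
        pose = matmul(pose, transform(theta, length))
        points.append((pose[0][2], pose[1][2]))
    return points


if __name__ == "__main__":
    for x, y in joint_positions([math.pi / 2, -math.pi / 2], [1.0, 1.0]):
        print(f"({x:.3f}, {y:.3f})")
```

A gait planner of the kind the abstract mentions would evaluate such positions over a sequence of joint angles; that planning step is not shown here.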
Abstract: Under-actuated robot grippers, a pervasive robotic tool, have become a considerable research focus. Despite the simplicity of their mechanical design and control strategy, they suffer from poor versatility and weak adaptability, which limits their widespread application. To address these limitations, we present a novel 3-finger linkage-based gripper that realizes retractable and reconfigurable multi-mode grasps driven by a single motor. Firstly, inspired by the changes in the contact surface that occur as a human finger moves, we design a slider-slide rail mechanism as the phalanx to achieve retraction of each finger, allowing for better performance in the enveloping grasping mode. Secondly, a reconfigurable structure is constructed to broaden the range of object dimensions that the proposed gripper can grasp. By adjusting the configuration and gesture of each finger, the gripper can achieve five grasping modes. Thirdly, the proposed gripper is actuated by only a single motor, yet it is capable of grasping and reconfiguring simultaneously. Finally, various experiments on grasping slender, thin, and large-volume objects are carried out to evaluate the performance of the proposed gripper in practical scenarios, demonstrating its excellent grasping capabilities.
Abstract: Compliant grippers, owing to their adaptivity and safety, have attracted considerable attention for unstructured grasping in real applications, such as industrial or logistic scenarios. However, little progress has been made to date on accurately constructing a mathematical model that describes the bidirectional relationship between shape deformation and contact force for such grippers, for example the Fin-Ray grippers. To address this research gap, this article devises, presents, and experimentally validates a universal bidirectional force-displacement mathematical model for compliant grippers based on the co-rotational concept, which endows such grippers with an intrinsic force sensing capability and offers better insight into design optimization. In Part 1 of the article, we introduce the fundamental theory of the co-rotational approach, with which arbitrarily large deformations of beam elements can be modeled. Its intrinsic principle enables the theoretical modeling to consider various types of configurations and key design parameters with very few assumptions. Further, a force control algorithm is proposed, providing accurate displacement estimations of the gripper under external forces with minor computational loads. The performance of the proposed method is experimentally verified through comparison with Finite Element Analysis, where the influence of four key design parameters on the gripper's performance is investigated, facilitating systematic design optimization. Part 2 of this article, which demonstrates the force sensing capabilities and the effects of representative co-rotational modeling parameters on model accuracy, is released on Google Drive.
Abstract: Compliant grasping is an essential capability for most robots in practical applications. For compliant robotic end-effectors that commonly appear in industrial or logistic scenarios, such as the Fin-Ray gripper, it remains challenging to build a bidirectional mathematical model that mutually maps shape deformation and contact force. Part I of this article constructed the force-displacement relationship for design optimization through the co-rotational theory with very few assumptions. In Part II, we further devise a detailed displacement-force mathematical model, enabling the compliant gripper to precisely estimate contact force without a force sensor. Specifically, the proposed approach based on the co-rotational theory can calculate contact forces from deformations. The presented displacement-control algorithm computes contact forces and provides force feedback for the force control system of a gripper, where deformation appears as displacements at contact points. Afterward, simulation experiments are conducted to evaluate the performance of the proposed model through comparisons with finite-element analysis (FEA). Simulation results reveal that the proposed model accurately estimates contact force, with an average error of around 5% across all single- and multiple-node cases, regardless of the design parameters (Part I of this article is released on Google Drive).
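To make the idea of recovering forces from deformations concrete, here is a minimal sketch of the element-level step in a textbook planar co-rotational Euler-Bernoulli beam formulation. It is an assumption that this matches the spirit of the article's model; the abstract gives no equations, and the function name, arguments, and stiffness values are hypothetical. The rigid-body rotation of the element chord is removed first, and the remaining stretch and local end rotations give the axial force and end moments.

```python
import math


def corotational_local_forces(x1, x2, u1, u2, EA, EI):
    """Internal forces of one planar beam element in its co-rotated frame.

    x1, x2: undeformed node coordinates (x, y).
    u1, u2: nodal displacements (ux, uy, rotation in radians).
    EA, EI: axial and bending stiffness (assumed constant along the element).
    Returns (N, M1, M2): axial force and the two end moments.
    """
    dx0, dy0 = x2[0] - x1[0], x2[1] - x1[1]
    dx = dx0 + u2[0] - u1[0]
    dy = dy0 + u2[1] - u1[1]
    l0 = math.hypot(dx0, dy0)  # undeformed length
    ln = math.hypot(dx, dy)    # deformed chord length
    # Rigid-body rotation of the chord, wrapped to (-pi, pi].
    alpha = math.atan2(dy, dx) - math.atan2(dy0, dx0)
    alpha = math.atan2(math.sin(alpha), math.cos(alpha))
    # Deformation that remains after removing the rigid-body motion.
    stretch = ln - l0
    t1 = u1[2] - alpha
    t2 = u2[2] - alpha
    # Linear Euler-Bernoulli relations in the local frame.
    N = EA * stretch / l0
    M1 = 2.0 * EI / l0 * (2.0 * t1 + t2)
    M2 = 2.0 * EI / l0 * (t1 + 2.0 * t2)
    return N, M1, M2


if __name__ == "__main__":
    # Pure rigid rotation by 0.5 rad about node 1: internal forces should vanish.
    a = 0.5
    print(corotational_local_forces((0, 0), (1, 0),
                                    (0, 0, a), (math.cos(a) - 1, math.sin(a), a),
                                    EA=1000.0, EI=10.0))
```

Mapping these local forces back to global nodal forces and assembling them over all elements of a gripper, which a contact-force estimate would require, is beyond this sketch.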