Abstract:Over the past decades, soft and reconfigurable robots have emerged rapidly thanks to their ability to interact safely with humans and adapt to complex environments. However, their softness makes accurate control very challenging. High-fidelity sensing, especially of posture and contact, is critical to improving control performance. Traditional camera-based sensors and load cells have limited portability and accuracy, and they inevitably increase the robot's cost and weight. In this study, instead of using specialized sensors, we collect only distributed pressure data inside a pneumatically driven soft arm and apply the physical reservoir computing principle to simultaneously predict its kinematic posture (i.e., bending angle) and payload status (i.e., payload mass). Our results show that, with careful readout training, one can obtain accurate bending angle and payload mass predictions via simple, weighted linear summations of pressure readings. In addition, our comparative analysis shows that, to guarantee prediction errors within 10\%, bending angle prediction requires less training data than payload prediction. This result reveals that balanced linear and nonlinear body dynamics are critical for the physical reservoir to accomplish complex proprioceptive and exteroceptive information perception tasks. Finally, the approach to identifying the most efficient readout training methods presented in this paper could be extended to other soft robotic systems to maximize their perception capabilities.
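The "weighted linear summations of pressure readings" mentioned above can be illustrated with a minimal sketch of a linear readout trained by ridge regression. Everything here is an assumption for illustration: the function names `train_readout` and `predict`, the ridge penalty, the number of sensors, and the synthetic data that stands in for real pressure recordings. It is not the authors' training procedure.

```python
import numpy as np

def train_readout(states, targets, ridge=1e-6):
    """Fit readout weights W so that [states, 1] @ W approximates targets."""
    # Append a constant column so the readout can learn a bias term.
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    # Ridge-regularized normal equations: (X^T X + ridge * I) W = X^T Y.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def predict(states, weights):
    """Apply the trained readout: one weighted linear sum per target."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    return X @ weights

# Synthetic stand-in data (hypothetical): 6 pressure channels, 2 targets
# playing the roles of bending angle and payload mass.
rng = np.random.default_rng(0)
n_samples, n_sensors = 400, 6
pressures = rng.normal(size=(n_samples, n_sensors))
true_w = rng.normal(size=(n_sensors + 1, 2))
targets = np.hstack([pressures, np.ones((n_samples, 1))]) @ true_w
targets += 0.01 * rng.normal(size=targets.shape)

# Train on the first 300 samples, evaluate on the remaining 100.
W = train_readout(pressures[:300], targets[:300])
pred = predict(pressures[300:], W)
rmse = np.sqrt(np.mean((pred - targets[300:]) ** 2, axis=0))
nrmse = rmse / np.std(targets[300:], axis=0)
print("normalized RMSE per target:", nrmse)
```

Both targets share the same state matrix and differ only in their weight columns, which is what makes simultaneous prediction cheap once the pressure data are collected.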
Abstract:This paper documents our characterization study and practices for serving text-to-image requests with stable diffusion models in production. We first comprehensively analyze inference request traces for commercial text-to-image applications. We begin with the observation that add-on modules, i.e., ControlNets and LoRAs, that augment the base stable diffusion models are ubiquitous in generating images for commercial applications. Despite their efficacy, these add-on modules incur high loading overhead, prolong the serving latency, and consume expensive GPU resources. Driven by our characterization study, we present SwiftDiffusion, a system that efficiently generates high-quality images using stable diffusion models and add-on modules. To achieve this, SwiftDiffusion reconstructs the existing text-to-image serving workflow by identifying the opportunities for parallel computation and distributing ControlNet computations across multiple GPUs. Further, SwiftDiffusion thoroughly analyzes the dynamics of image generation and develops techniques to eliminate the overhead associated with LoRA loading and patching while preserving the image quality. Finally, SwiftDiffusion introduces specialized optimizations in the backbone architecture of the stable diffusion models, which are also compatible with the efficient serving of add-on modules. Compared to state-of-the-art text-to-image serving systems, SwiftDiffusion reduces serving latency by up to 5x and improves serving throughput by up to 2x without compromising image quality.
Abstract:Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known attribute-object pairs. The primary challenge in CZSL tasks lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object, which degrade classification performance on novel compositions. Previous works primarily addressed this issue by focusing on disentangling strategies or by utilizing object-based conditional probabilities to constrain the selection space of attributes. Unfortunately, few studies have explored the problem from the perspective of modeling the mechanism of visual primitive interactions. Inspired by the success of vanilla adversarial learning in Cross-Domain Few-Shot Learning, we take a step further and devise a model-agnostic, Primitive-Based Adversarial training (PBadv) method to deal with this problem. In addition, recent studies highlight that hard compositions are poorly perceived even under data-balanced conditions. To this end, we propose a novel over-sampling strategy with object-similarity guidance to augment target compositional training data. We performed detailed quantitative analysis and retrieval experiments on well-established datasets, such as UT-Zappos50K, MIT-States, and C-GQA, to validate the effectiveness of our proposed method, and the state-of-the-art (SOTA) performance demonstrates the superiority of our approach. The code is available at https://github.com/lisuyi/PBadv_czsl.
Abstract:Yoshimura origami is a classical folding pattern that has inspired many deployable structure designs. Its applications span space exploration, kinetic architecture, soft robots, and even everyday household items. However, despite its wide usage, Yoshimura origami has been bound to a set of design constraints that ensure its flat-foldability. Through extensive kinematic analysis and prototype tests, this study presents a new Yoshimura that intentionally defies these constraints. Remarkably, one can impart a unique meta-stability by using the Golden Ratio angle to define the triangular facets of a generalized Yoshimura. As a result, when its facets are strategically popped out, a ``Golden Ratio Yoshimura'' boom with $m$ modules can be theoretically reconfigured into $8^m$ geometrically unique and load-bearing shapes. This result not only challenges the existing design norms but also opens up a new avenue to create deployable and versatile structural systems.
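To make the $8^m$ count concrete, the short sketch below enumerates configuration labels under the assumption, inferred only from the form of the count, that each module independently takes one of 8 states. The constant `STATES_PER_MODULE` and the function `boom_configurations` are hypothetical names. The code counts label combinations only; it does not verify that the shapes are geometrically unique or load-bearing, which is the abstract's claim.

```python
from itertools import product

STATES_PER_MODULE = 8  # assumption: inferred from the 8^m count in the abstract

def boom_configurations(m):
    """Return every assignment of one state label per module for m modules."""
    return list(product(range(STATES_PER_MODULE), repeat=m))

# A boom with 3 modules: each configuration is a tuple of 3 state labels.
configs = boom_configurations(3)
print(len(configs))  # 512, i.e. 8**3
```

The count grows exponentially with the number of modules, so even a short boom offers a large catalogue of candidate shapes.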
Abstract:This study presents a modular, electronics-free, and fully onboard control and actuation approach for shape memory alloy (SMA)-based soft robots to achieve locomotion tasks. This approach exploits the nonlinear mechanics of compliant curved beams and carefully designed mechanical control circuits to create and synchronize rhythmic deformation cycles, mimicking the central pattern generators (CPG) prevalent in animal locomotion. More specifically, the study elucidates a new strategy to amplify the actuation performance of the SMA coil actuator by coupling it to a carefully designed, mono-stable curved beam with a snap-through buckling behavior. This SMA-curved beam assembly is integrated with an entirely mechanical circuit featuring a slider mechanism. This circuit can automatically cut off and supply current to the SMA according to its deformation status, generating a self-sustained rhythmic deformation cycle using a simple DC power supply. Finally, this study presents a new strategy to coordinate (synchronize) two rhythmic deformation cycles from two robotic modules to achieve efficient crawling locomotion while still using a single DC power supply. This work represents a significant step towards fully autonomous, electronics-free SMA-based locomotion robots with fully onboard actuation and control.
Abstract:In this paper, we experimentally examine the cognitive capability of a simple, paper-based Miura-ori -- using the physical reservoir computing framework -- to achieve different information perception tasks. The body dynamics of Miura-ori (i.e., its vertex displacements), which are excited by a simple harmonic base excitation, can be exploited as the reservoir computing resource. By recording these dynamics with a high-resolution camera and an image processing program and then using linear regression for training, we show that the origami reservoir has sufficient computing capacity to estimate the weight and position of a payload. It can also recognize the input frequency and magnitude patterns. Furthermore, multitasking is achievable by simultaneously applying two targeted functions to the same reservoir state matrix. Therefore, we demonstrate that Miura-ori can assess the dynamic interactions between its body and ambient environment to extract meaningful information -- an intelligent behavior in the mechanical domain. Given that Miura-ori has been widely used to construct deployable structures, lightweight materials, and compliant robots, enabling such information perception tasks can add a new dimension to the functionality of such a versatile structure.
Abstract:A new paradigm called physical reservoir computing has recently emerged, where the nonlinear dynamics of high-dimensional and fixed physical systems are harnessed as a computational resource to achieve complex tasks. Via extensive simulations based on a dynamic truss-frame model, this study shows that an origami structure can perform as a dynamic reservoir with sufficient computing power to emulate high-order nonlinear systems, generate stable limit cycles, and modulate outputs according to dynamic inputs. This study also uncovers the linkages between the origami reservoir's physical designs and its computing power, offering a guideline to optimize the computing performance. Comprehensive parametric studies show that selecting the optimal feedback crease distribution and fine-tuning the underlying origami folding designs are the most effective approaches to improve computing performance. Furthermore, this study shows how origami's physical reservoir computing power can apply to soft robotic control problems through a case study of earthworm-like peristaltic crawling without traditional controllers. These results can pave the way for origami-based robots with embodied mechanical intelligence.
Abstract:Via numerical simulation and experimental assessment, this study examines the use of origami folding to develop robotic jumping mechanisms with tailored nonlinear stiffness to improve dynamic performance. Specifically, we use Tachi-Miura Polyhedron (TMP) bellow origami -- which exhibits a nonlinear "strain-softening" force-displacement curve -- as a jumping robotic skeleton with embedded energy storage. TMP's nonlinear stiffness allows it to store more energy than a linear spring and offers improved jumping height and airtime. Moreover, the nonlinearity can be tailored by directly changing the underlying TMP crease geometry. A critical challenge is to minimize the TMP's hysteresis and energy loss during its compression stage right before jumping. Therefore, we use the plastically annealed lamina emergent origami (PALEO) concept to modify the TMP creases. PALEO increases the folding limit before plastic deformation occurs, thus improving the overall strain energy retention. Jumping experiments confirmed that a nonlinear TMP mechanism achieved roughly a 9% improvement in airtime and a 13% improvement in jumping height compared to a "control" TMP sample with a relatively linear stiffness. This study's results validate the advantages of using origami in robotic jumping mechanisms and demonstrate the benefits of utilizing nonlinear spring elements for improving jumping performance. Therefore, they could foster a new family of energetically efficient jumping mechanisms with optimized performance in the future.
Abstract:This study examines a biology-inspired approach of using reconfigurable articulation to reduce the control requirement for soft robotic arms. We construct a robotic arm by assembling Kresling origami modules that exhibit predictable bistability. Via switching between their two stable states, these origami modules can behave either like a flexible joint with low bending stiffness or like a stiff link with high stiffness, without requiring any continuous power supply. In this way, the robotic arm can exhibit pseudo-linkage kinematics with lower control requirements and improved motion accuracy. A unique advantage of using origami as the robotic arm skeleton is that its bending stiffness ratio between stable states is directly related to the underlying Kresling design. Therefore, we conduct extensive parametric analyses and experimental validations to identify the optimized Kresling pattern for articulation. The results indicate that a higher angle ratio, a smaller resting length at the contracted stable state, and a larger number of polygon sides can offer more significant and robust bending stiffness tuning. Based on this insight, we construct a proof-of-concept, tendon-driven robotic arm consisting of three modules, and show that it can exhibit the desired reconfigurable articulation behavior. Moreover, the deformations of this manipulator are consistent with kinematic model predictions, which validates the possibility of using simple controllers for such compliant robotic systems.
Abstract:Federated learning systems are vulnerable to attacks from malicious clients. As the central server in the system cannot govern the behaviors of the clients, a rogue client may initiate an attack by sending malicious model updates to the server, so as to degrade the learning performance or launch targeted model poisoning attacks (a.k.a. backdoor attacks). Therefore, timely detection of these malicious model updates and the underlying attackers is critically important. In this work, we propose a new framework for robust federated learning where the central server learns to detect and remove the malicious model updates using a powerful detection model, leading to targeted defense. We evaluate our solution in both image classification and sentiment analysis tasks with a variety of machine learning models. Experimental results show that our solution ensures robust federated learning that is resilient to both Byzantine attacks and targeted model poisoning attacks.
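The detect-and-remove step described above can be sketched as a server-side aggregation that drops flagged updates before averaging. The abstract does not specify the learned detection model, so this sketch swaps in a simple robust-statistics stand-in: distance to the coordinate-wise median. The function name `filter_and_aggregate`, the `threshold` value, and the synthetic client updates are all assumptions for illustration.

```python
import numpy as np

def filter_and_aggregate(updates, threshold=3.0):
    """Average client updates after dropping those flagged as anomalous.

    Stand-in detector (an assumption, not the paper's learned model): an
    update is flagged when its distance to the coordinate-wise median exceeds
    `threshold` times the median of all such distances.
    """
    updates = np.asarray(updates, dtype=float)
    center = np.median(updates, axis=0)            # robust reference update
    dists = np.linalg.norm(updates - center, axis=1)
    scale = np.median(dists) + 1e-12               # robust distance scale
    keep = dists <= threshold * scale              # True for retained clients
    return updates[keep].mean(axis=0), keep

# Synthetic round (hypothetical): 8 honest clients and 2 malicious clients
# that send large, opposite-sign updates.
rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.1, size=(8, 5))
malicious = np.full((2, 5), -20.0)
updates = np.vstack([honest, malicious])

aggregate, keep = filter_and_aggregate(updates)
print("clients kept:", keep)
print("aggregate:", aggregate)
```

A median-based rule of this kind only works while malicious clients are a minority and their updates are far from the honest ones; a learned detector, as proposed in the abstract, is meant to handle subtler cases such as backdoor attacks.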