Abstract:Large language models (LLMs) introduce new security risks, but there are few comprehensive evaluation suites to measure and reduce these risks. We present BenchmarkName, a novel benchmark to quantify LLM security risks and capabilities. We introduce two new areas for testing: prompt injection and code interpreter abuse. We evaluated multiple state-of-the-art (SOTA) LLMs, including GPT-4, Mistral, Meta Llama 3 70B-Instruct, and Code Llama. Our results show that conditioning away risk of attack remains an unsolved problem; for example, all tested models showed between 26% and 41% successful prompt injection tests. We further introduce the safety-utility tradeoff: conditioning an LLM to reject unsafe prompts can cause the LLM to falsely reject answering benign prompts, which lowers utility. We propose quantifying this tradeoff using False Refusal Rate (FRR). As an illustration, we introduce a novel test set to quantify FRR for cyberattack helpfulness risk. We find many LLMs able to successfully comply with "borderline" benign requests while still rejecting most unsafe requests. Finally, we quantify the utility of LLMs for automating a core cybersecurity task, that of exploiting software vulnerabilities. This is important because the offensive capabilities of LLMs are of intense interest; we quantify this by creating novel test sets for four representative problems. We find that models with coding capabilities perform better than those without, but that further work is needed for LLMs to become proficient at exploit generation. Our code is open source and can be used to evaluate other LLMs.
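The False Refusal Rate mentioned above can be read as the fraction of benign prompts that a model refuses. The following is a minimal sketch under that reading, assuming each test outcome is recorded as a pair of booleans (whether the prompt is benign, whether the model refused). The function name and the record layout are illustrative assumptions and are not taken from the benchmark's code.

```python
from typing import Iterable, Tuple


def false_refusal_rate(results: Iterable[Tuple[bool, bool]]) -> float:
    """Fraction of benign prompts that the model refused.

    Each item is (is_benign, refused). Both fields are hypothetical
    labels assumed for this sketch, not the benchmark's own schema.
    """
    benign_total = 0
    benign_refused = 0
    for is_benign, refused in results:
        if is_benign:
            benign_total += 1
            if refused:
                benign_refused += 1
    if benign_total == 0:
        raise ValueError("no benign prompts in results")
    return benign_refused / benign_total


# Toy data: 4 benign prompts (1 refused) and 2 unsafe prompts (both refused).
toy = [(True, False), (True, True), (True, False), (True, False),
       (False, True), (False, True)]
print(false_refusal_rate(toy))  # 0.25
```

Refusals of unsafe prompts do not enter the rate, so a model can lower its FRR only by complying with more benign prompts, not by refusing fewer unsafe ones.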
Abstract:Recent work has shown that 3D Gaussian-based SLAM enables high-quality reconstruction, accurate pose estimation, and real-time rendering of scenes. However, these approaches are built on a tremendous number of redundant 3D Gaussian ellipsoids, leading to high memory and storage costs and slow training. To address these limitations, we propose a compact 3D Gaussian Splatting SLAM system that reduces both the number and the parameter size of the Gaussian ellipsoids. We first propose a sliding window-based masking strategy to remove redundant ellipsoids. We then observe that the covariance matrices (geometry) of most 3D Gaussian ellipsoids are extremely similar, which motivates a novel geometry codebook to compress 3D Gaussian geometric attributes, i.e., the parameters. Robust and accurate pose estimation is achieved by a global bundle adjustment method with a reprojection loss. Extensive experiments demonstrate that our method achieves faster training and rendering while maintaining state-of-the-art (SOTA) scene representation quality.
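The codebook idea above can be illustrated with generic vector quantization: similar geometry vectors share one codebook entry, and each Gaussian stores only an index. The abstract does not say how the codebook is built, so this sketch substitutes plain k-means as an assumption. The function names, the flat per-Gaussian geometry vector, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: List[List[float]], v: Vector) -> int:
    """Index of the codebook entry closest to v."""
    return min(range(len(codebook)), key=lambda i: _sq_dist(codebook[i], v))


def build_codebook(vectors: List[Vector], k: int,
                   iters: int = 10) -> Tuple[List[List[float]], List[int]]:
    """Plain k-means (an assumed stand-in for the paper's codebook learning).

    Returns (codebook, codes), where codes[i] is the entry index that
    replaces the full geometry vector of Gaussian i.
    """
    if not 0 < k <= len(vectors):
        raise ValueError("k must be in 1..len(vectors)")
    # Deterministic initialisation: evenly spaced samples from the input.
    step = len(vectors) / k
    codebook = [list(vectors[int(i * step)]) for i in range(k)]
    for _ in range(iters):
        codes = [nearest(codebook, v) for v in vectors]
        for c in range(k):
            members = [v for v, code in zip(vectors, codes) if code == c]
            if members:  # keep the old centroid if a cluster empties
                dim = len(members[0])
                codebook[c] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    codes = [nearest(codebook, v) for v in vectors]
    return codebook, codes


# Toy geometry vectors (hypothetical per-axis scales of four Gaussians).
geometry = [[1.0, 1.0, 1.0], [1.1, 0.9, 1.0],
            [5.0, 5.0, 0.1], [5.1, 4.9, 0.1]]
codebook, codes = build_codebook(geometry, k=2)
print(codes)  # [0, 0, 1, 1]
```

With this layout, storage falls from one full vector per Gaussian to one small integer per Gaussian plus the shared codebook, which is the kind of saving the abstract attributes to highly similar covariance matrices.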
Abstract:Soft actuators, with their inherently compliant design, have drawn significant attention from researchers as a way to address safety issues in physical human-robot interaction. However, their inherent material softness also makes them vulnerable and poses new challenges in design, fabrication, and analysis. In this paper, a novel hybrid actuator design is presented, bio-inspired by the lobster or, more broadly, by crustaceans. We enclose a soft chamber of rectangular cross-section in a series of articulated rigid shells to produce bending under pneumatic input. By mimicking the shell pattern of the lobster's abdomen, foldable rigid shells are designed to provide the soft actuator with full protection throughout the motion range. The articulation of the rigid shells predefines the actuator's bending motions. As a result, the proposed design can be analyzed with simplified quasi-static models and rigid-body kinematics, which are further validated by mechanical tests. This paper demonstrates that the proposed hybrid actuator design is capable of overcoming the major design drawbacks of entirely rigid and entirely soft robots while preserving their engineering merits in performance.
Abstract:Classical rigid-bodied robotic systems, despite proven success in theoretical development and industrial applications, have recently been challenged by the emergence of soft robotics, driven by a growing need for physical human-robot interaction (pHRI) in wearable devices, medical robots, personal robots, etc. In this paper, we present the design and fabrication of a robust hybrid bending actuator built from both rigid and soft components and inspired by crustaceans. Its bending radius and axis can be mechanically programmed through the selective activation of the rigid exterior joints, which are actuated by the soft actuators inside. The hybrid actuator was characterized experimentally through bending and force tests to demonstrate the utility of this design. Finally, a case study is presented to demonstrate its capacity to adapt to the geometry of specific objects, anticipating its potential application in situations where compliance is the priority.
Abstract:This paper presents preliminary results on the design, development, and evaluation of a hand rehabilitation glove fabricated using a lobster-inspired hybrid design with rigid and soft components for actuation. Inspired by the bending abdomen of lobsters, the hybrid actuators are built from serially jointed rigid shells actuated by pressurized soft chambers inside to generate bending motions. This bio-inspired design combines features of classical rigid-bodied robotics, with its precisely defined motion generation, and of emerging soft robotics, with its lightweight, physically safe, and adaptive actuation. The fabrication procedure is described, followed by experiments to mechanically characterize these actuators. Finally, an open-palm glove design integrated with these hybrid actuators is presented for a qualitative case study. A hand rehabilitation system is developed by learning patterns of the sEMG signals from the user's forearm to train the assistive glove for hand rehabilitation exercises.
Abstract:Seed scheduling is a prominent factor in determining the yields of hybrid fuzzing. Existing hybrid fuzzers schedule seeds based on fixed heuristics that aim to predict input utilities. However, such heuristics are not generalizable, as no one-size-fits-all rule applies to different programs. They may work well on the programs from which they were derived, but not on others. To overcome this problem, we design a Machine learning-Enhanced hybrid fUZZing system (MEUZZ), which employs supervised machine learning for adaptive and generalizable seed scheduling. MEUZZ determines which new seeds are expected to produce better fuzzing yields based on the knowledge learned from past seed scheduling decisions made on the same or similar programs. MEUZZ's learning is based on a series of features extracted via code reachability and dynamic analysis, which incurs negligible runtime overhead (in microseconds). Moreover, MEUZZ automatically infers the data labels by evaluating the fuzzing performance of each selected seed. As a result, MEUZZ is generally applicable to, and performs well on, various kinds of programs. Our evaluation shows that MEUZZ significantly outperforms state-of-the-art grey-box and hybrid fuzzers, achieving 27.1% more code coverage than QSYM. The learned models are reusable and transferable, which boosts fuzzing performance by 7.1% on average and improves 68% of the 56 cross-program fuzzing campaigns. MEUZZ discovered 47 deeply hidden and previously unknown bugs, 21 of which were confirmed and fixed by the developers, when fuzzing 8 well-tested programs with the same configurations as used in previous work.
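The learning loop described above (extract features per seed, label each scheduled seed by its observed fuzzing yield, then rank new seeds by predicted utility) can be sketched in a few lines. This is an illustration only: the abstract does not specify MEUZZ's model or feature set, so the linear model trained by stochastic gradient descent, the two-element feature vectors, and all names below are assumptions.

```python
from typing import Dict, List, Sequence


class LinearUtilityModel:
    """Tiny linear regressor trained by SGD (an assumed stand-in model)."""

    def __init__(self, n_features: int, lr: float = 0.1) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x: Sequence[float]) -> float:
        """Predicted utility of a seed with feature vector x."""
        return self.b + sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x: Sequence[float], observed_yield: float) -> None:
        """One gradient step on the squared error for a single seed."""
        err = self.predict(x) - observed_yield
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def schedule(model: LinearUtilityModel,
             seeds: Dict[str, Sequence[float]]) -> List[str]:
    """Order seed names from highest to lowest predicted utility."""
    return sorted(seeds, key=lambda s: model.predict(seeds[s]), reverse=True)


# Hypothetical history: (features, yield observed after scheduling the seed).
# In this toy data the yield is driven by the first feature only.
history = [([1.0, 0.0], 2.0), ([0.0, 1.0], 0.0),
           ([0.5, 0.5], 1.0), ([0.2, 0.8], 0.4)]

model = LinearUtilityModel(n_features=2)
for _ in range(200):
    for features, observed in history:
        model.update(features, observed)

# Rank two new seeds by predicted utility.
new_seeds = {"a": [0.9, 0.1], "b": [0.1, 0.9]}
print(schedule(model, new_seeds))  # ['a', 'b']
```

Because the labels come from measured yields rather than manual annotation, the same loop can keep updating the model during a campaign, which mirrors the automatic label inference the abstract describes.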