Member, IEEE
Abstract:Autonomous drones are rapidly reshaping industries ranging from aerial delivery and infrastructure inspection to environmental monitoring and disaster response. Ensuring the safety, reliability, and efficiency of these systems is paramount as they transition from research prototypes to mission-critical platforms. This paper presents a step-by-step guide to establishing a robust autonomous drone testing pipeline, covering each critical stage: Software-in-the-Loop (SIL) Simulation Testing, Hardware-in-the-Loop (HIL) Testing, Controlled Real-World Testing, and In-Field Testing. Using practical examples, including the marker-based autonomous landing system, we demonstrate how to systematically verify drone system behaviors, identify integration issues, and optimize performance. Furthermore, we highlight emerging trends shaping the future of drone testing, including the integration of Neurosymbolic and LLMs, creating co-simulation environments, and Digital Twin-enabled simulation-based testing techniques. By following this pipeline, developers and researchers can achieve comprehensive validation, minimize deployment risks, and prepare autonomous drones for safe and reliable real-world operations.
Abstract:Uncrewed Aerial Vehicles (UAVs) have become a focal point of research, with both established companies and startups investing heavily in their development. This paper presents our iterative process in developing a robust autonomous marker-based landing system, highlighting the key challenges encountered and the solutions implemented. It reviews existing systems for autonomous landing processes, and through this aims to contribute to the community by sharing insights and challenges faced during development and testing.
Abstract:Autonomous cyber-physical systems (CPSs) leverage AI for perception, planning, and control but face trust and safety certification challenges due to inherent uncertainties. The neurosymbolic paradigm replaces stochastic layers with interpretable symbolic AI, enabling determinism. While promising, challenges like multisensor fusion, adaptability, and verification remain. This paper introduces NeuroStrata, a neurosymbolic framework to enhance the testing and verification of autonomous CPS. We outline its key components, present early results, and detail future plans.
Abstract:Autonomous Driving Systems (ADSs) are safety-critical, as real-world safety violations can result in significant losses. Rigorous testing is essential before deployment, with simulation testing playing a key role. However, ADSs are typically complex, consisting of multiple modules such as perception and planning, or well-trained end-to-end autonomous driving systems. Offline methods, such as the Genetic Algorithm (GA), can only generate predefined trajectories for dynamics, which struggle to cause safety violations for ADSs rapidly and efficiently in different scenarios due to their evolutionary nature. Online methods, such as single-agent reinforcement learning (RL), can quickly adjust the dynamics' trajectory online to adapt to different scenarios, but they struggle to capture complex corner cases of ADS arising from the intricate interplay among multiple vehicles. Multi-agent reinforcement learning (MARL) has a strong ability in cooperative tasks. On the other hand, it faces its own challenges, particularly with convergence. This paper introduces MARL-OT, a scalable framework that leverages MARL to detect safety violations of ADS resulting from surrounding vehicles' cooperation. MARL-OT employs MARL for high-level guidance, triggering various dangerous scenarios for the rule-based online fuzzer to explore potential safety violations of ADS, thereby generating dynamic, realistic safety violation scenarios. Our approach improves the detected safety violation rate by up to 136.2% compared to the state-of-the-art (SOTA) testing technique.
Abstract:As data emerges as a vital driver of technological and economic advancements, a key challenge is accurately quantifying its value in algorithmic decision-making. The Shapley value, a well-established concept from cooperative game theory, has been widely adopted to assess the contribution of individual data sources in supervised machine learning. However, its symmetry axiom assumes all players in the cooperative game are homogeneous, which overlooks the complex structures and dependencies present in real-world datasets. To address this limitation, we extend the traditional data Shapley framework to asymmetric data Shapley, making it flexible enough to incorporate inherent structures within the datasets for structure-aware data valuation. We also introduce an efficient $k$-nearest neighbor-based algorithm for its exact computation. We demonstrate the practical applicability of our framework across various machine learning tasks and data market contexts. The code is available at: https://github.com/xzheng01/Asymmetric-Data-Shapley.
Abstract:Vision-language pretraining (VLP) with transformers has demonstrated exceptional performance across numerous multimodal tasks. However, the adversarial robustness of these models has not been thoroughly investigated. Existing multimodal attack methods have largely overlooked cross-modal interactions between visual and textual modalities, particularly in the context of cross-attention mechanisms. In this paper, we study the adversarial vulnerability of recent VLP transformers and design a novel Joint Multimodal Transformer Feature Attack (JMTFA) that concurrently introduces adversarial perturbations in both visual and textual modalities under white-box settings. JMTFA strategically targets attention relevance scores to disrupt important features within each modality, generating adversarial samples by fusing perturbations and leading to erroneous model predictions. Experimental results indicate that the proposed approach achieves high attack success rates on vision-language understanding and reasoning downstream tasks compared to existing baselines. Notably, our findings reveal that the textual modality significantly influences the complex fusion processes within VLP transformers. Moreover, we observe no apparent relationship between model size and adversarial robustness under our proposed attacks. These insights emphasize a new dimension of adversarial robustness and underscore potential risks in the reliable deployment of multimodal AI systems.
Abstract:The integration of machine learning (ML) into cyber-physical systems (CPS) offers significant benefits, including enhanced efficiency, predictive capabilities, real-time responsiveness, and the enabling of autonomous operations. This convergence has accelerated the development and deployment of a range of real-world applications, such as autonomous vehicles, delivery drones, service robots, and telemedicine procedures. However, the software development life cycle (SDLC) for AI-infused CPS diverges significantly from traditional approaches, featuring data and learning as two critical components. Existing verification and validation techniques are often inadequate for these new paradigms. In this study, we pinpoint the main challenges in ensuring formal safety for learningenabled CPS.We begin by examining testing as the most pragmatic method for verification and validation, summarizing the current state-of-the-art methodologies. Recognizing the limitations in current testing approaches to provide formal safety guarantees, we propose a roadmap to transition from foundational probabilistic testing to a more rigorous approach capable of delivering formal assurance.
Abstract:Mutual localization stands as a foundational component within various domains of multi-robot systems. Nevertheless, in relative pose estimation, time synchronization is usually underappreciated and rarely addressed, although it significantly influences estimation accuracy. In this paper, we introduce time synchronization into mutual localization to recover the time offset and relative poses between robots simultaneously. Under a constant velocity assumption in a short time, we fuse time offset estimation with our previous bearing-based mutual localization by a novel error representation. Based on the error model, we formulate a joint optimization problem and utilize semi-definite relaxation (SDR) to furnish a lossless relaxation. By solving the relaxed problem, time synchronization and relative pose estimation can be achieved when time drift between robots is limited. To enhance the application range of time offset estimation, we further propose an iterative method to recover the time offset from coarse to fine. Comparisons between the proposed method and the existing ones through extensive simulation tests present prominent benefits of time synchronization on mutual localization. Moreover, real-world experiments are conducted to show the practicality and robustness.
Abstract:Brain-Computer Interfaces (BCIs) are a groundbreaking technology for interacting with external devices using brain signals. Despite advancements, electroencephalogram (EEG)-based Motor Imagery (MI) tasks face challenges like amplitude and phase variability, and complex spatial correlations, with a need for smaller model size and faster inference. This study introduces the LGL-BCI framework, employing a Geometric Deep Learning Framework for EEG processing in non-Euclidean metric spaces, particularly the Symmetric Positive Definite (SPD) Manifold space. LGL-BCI offers robust EEG data representation and captures spatial correlations. We propose an EEG channel selection solution via a feature decomposition algorithm to reduce SPD matrix dimensionality, with a lossless transformation boosting inference speed. Extensive experiments show LGL-BCI's superior accuracy and efficiency compared to current solutions, highlighting geometric deep learning's potential in MI-BCI applications. The efficiency, assessed on two public EEG datasets and two real-world EEG devices, significantly outperforms the state-of-the-art solution in accuracy ($82.54\%$ versus $62.22\%$) with fewer parameters (64.9M compared to 183.7M).
Abstract:Vulnerability detection is a critical problem in software security and attracts growing attention both from academia and industry. Traditionally, software security is safeguarded by designated rule-based detectors that heavily rely on empirical expertise, requiring tremendous effort from software experts to generate rule repositories for large code corpus. Recent advances in deep learning, especially Graph Neural Networks (GNN), have uncovered the feasibility of automatic detection of a wide range of software vulnerabilities. However, prior learning-based works only break programs down into a sequence of word tokens for extracting contextual features of codes, or apply GNN largely on homogeneous graph representation (e.g., AST) without discerning complex types of underlying program entities (e.g., methods, variables). In this work, we are one of the first to explore heterogeneous graph representation in the form of Code Property Graph and adapt a well-known heterogeneous graph network with a dual-supervisor structure for the corresponding graph learning task. Using the prototype built, we have conducted extensive experiments on both synthetic datasets and real-world projects. Compared with the state-of-the-art baselines, the results demonstrate promising effectiveness in this research direction in terms of vulnerability detection performance (average F1 improvements over 10\% in real-world projects) and transferability from C/C++ to other programming languages (average F1 improvements over 11%).