Abstract:Safety-critical traffic scenarios are of great practical relevance to evaluating the robustness of autonomous driving (AD) systems. Given that these long-tail events are extremely rare in real-world traffic data, there is a growing body of work dedicated to the automatic traffic scenario generation. However, nearly all existing algorithms for generating safety-critical scenarios rely on snippets of previously recorded traffic events, transforming normal traffic flow into accident-prone situations directly. In other words, safety-critical traffic scenario generation is hindsight and not applicable to newly encountered and open-ended traffic events.In this paper, we propose the Deep Motion Factorization (DeepMF) framework, which extends static safety-critical driving scenario generation to closed-loop and interactive adversarial traffic simulation. DeepMF casts safety-critical traffic simulation as a Bayesian factorization that includes the assignment of hazardous traffic participants, the motion prediction of selected opponents, the reaction estimation of autonomous vehicle (AV) and the probability estimation of the accident occur. All the aforementioned terms are calculated using decoupled deep neural networks, with inputs limited to the current observation and historical states. Consequently, DeepMF can effectively and efficiently simulate safety-critical traffic scenarios at any triggered time and for any duration by maximizing the compounded posterior probability of traffic risk. Extensive experiments demonstrate that DeepMF excels in terms of risk management, flexibility, and diversity, showcasing outstanding performance in simulating a wide range of realistic, high-risk traffic scenarios.
Abstract:Recent advancements in AI alignment techniques have significantly improved the alignment of large language models (LLMs) with static human preferences. However, the dynamic nature of human preferences can render some prior training data outdated or even erroneous, ultimately causing LLMs to deviate from contemporary human preferences and societal norms. Existing methodologies, whether they involve the curation of new data for continual alignment or the manual correction of outdated data for re-alignment, demand costly human resources. To address this challenge, we propose a novel approach, Large Language Model Behavior Correction with Influence Function Recall and Post-Training (LANCET), which requires no human involvement. LANCET consists of two phases: (1) using influence functions to identify the training data that significantly impact undesirable model outputs, and (2) applying an Influence function-driven Bregman Optimization (IBO) technique to adjust the model's behavior based on these influence distributions. Our experiments demonstrate that LANCET effectively and efficiently correct inappropriate behaviors of LLMs. Furthermore, LANCET can outperform methods that rely on collecting human preferences, and it enhances the interpretability of learning human preferences within LLMs.
Abstract:This paper explores the ability of Graph Neural Networks (GNNs) in learning various forms of information for link prediction, alongside a brief review of existing link prediction methods. Our analysis reveals that GNNs cannot effectively learn structural information related to the number of common neighbors between two nodes, primarily due to the nature of set-based pooling of the neighborhood aggregation scheme. Also, our extensive experiments indicate that trainable node embeddings can improve the performance of GNN-based link prediction models. Importantly, we observe that the denser the graph, the greater such the improvement. We attribute this to the characteristics of node embeddings, where the link state of each link sample could be encoded into the embeddings of nodes that are involved in the neighborhood aggregation of the two nodes in that link sample. In denser graphs, every node could have more opportunities to attend the neighborhood aggregation of other nodes and encode states of more link samples to its embedding, thus learning better node embeddings for link prediction. Lastly, we demonstrate that the insights gained from our research carry important implications in identifying the limitations of existing link prediction methods, which could guide the future development of more robust algorithms.
Abstract:For tasks conducted in unknown environments with efficiency requirements, real-time navigation of multi-robot systems remains challenging due to unfamiliarity with surroundings.In this paper, we propose a novel multi-robot collaborative planning method that leverages the perception of different robots to intelligently select search directions and improve planning efficiency. Specifically, a foundational planner is employed to ensure reliable exploration towards targets in unknown environments and we introduce Graph Attention Architecture with Information Gain Weight(GIWT) to synthesizes the information from the target robot and its teammates to facilitate effective navigation around obstacles.In GIWT, after regionally encoding the relative positions of the robots along with their perceptual features, we compute the shared attention scores and incorporate the information gain obtained from neighboring robots as a supplementary weight. We design a corresponding expert data generation scheme to simulate real-world decision-making conditions for network training. Simulation experiments and real robot tests demonstrates that the proposed method significantly improves efficiency and enables collaborative planning for multiple robots. Our method achieves approximately 82% accuracy on the expert dataset and reduces the average path length by about 8% and 6% across two types of tasks compared to the fundamental planner in ROS tests, and a path length reduction of over 6% in real-world experiments.
Abstract:Stationary balance control is challenging for single-track two-wheeled (STTW) robots due to the lack of elegant balancing mechanisms and the conflict between the limited attraction domain and external disturbances. To address the absence of balancing mechanisms, we draw inspiration from cyclists and leverage the track stand maneuver, which relies solely on steering and rear-wheel actuation. To achieve accurate tracking in the presence of matched and mismatched disturbances, we propose an equilibrium adaptation-based control (EABC) scheme that can be seamlessly integrated with standard disturbance observers and controllers. This scheme enables adaptation to slow-varying disturbances by utilizing a disturbed equilibrium estimator, effectively handling both matched and mismatched disturbances in a unified manner while ensuring accurate tracking with zero steady-state error. We integrate the EABC scheme with nonlinear model predictive control (MPC) for the track stand of STTW robots and validate its effectiveness through two experimental scenarios. Our method demonstrates significant improvements in tracking accuracy, reducing errors by several orders of magnitude.
Abstract:The tendency of Large Language Models (LLMs) to generate hallucinations raises concerns regarding their reliability. Therefore, confidence estimations indicating the extent of trustworthiness of the generations become essential. However, current LLM confidence estimations in languages other than English remain underexplored. This paper addresses this gap by introducing a comprehensive investigation of Multilingual Confidence estimation (MlingConf) on LLMs, focusing on both language-agnostic (LA) and language-specific (LS) tasks to explore the performance and language dominance effects of multilingual confidence estimations on different tasks. The benchmark comprises four meticulously checked and human-evaluate high-quality multilingual datasets for LA tasks and one for the LS task tailored to specific social, cultural, and geographical contexts of a language. Our experiments reveal that on LA tasks English exhibits notable linguistic dominance in confidence estimations than other languages, while on LS tasks, using question-related language to prompt LLMs demonstrates better linguistic dominance in multilingual confidence estimations. The phenomena inspire a simple yet effective native-tone prompting strategy by employing language-specific prompts for LS tasks, effectively improving LLMs' reliability and accuracy on LS tasks.
Abstract:This letter presents a model to address the collaborative effects in multi-agent systems from the perspective of microscopic mechanism. The model utilizes distributed control for robot swarms in traversal applications. Inspired by pedestrian planning dynamics, the model employs three types of forces to regulate the behavior of agents: intrinsic propulsion, interaction among agents, and repulsion from obstacles. These forces are able to balance the convergence, divergence and avoidance effects among agents. Additionally, we present a planning and decision method based on resultant forces to enable real-world deployment of the model. Experimental results demonstrate the effectiveness on system path optimization in unknown cluttered environments. The sensor data is swiftly digital filtered and the data transmitted is significantly compressed. Consequently, the model has low computation costs and minimal communication loads, thereby promoting environmental adaptability and system scalability.
Abstract:This paper presents Range-SLAM, a real-time, lightweight SLAM system designed to address the challenges of localization and mapping in environments with smoke and other harsh conditions using Ultra-Wideband (UWB) signals. While optical sensors like LiDAR and cameras struggle in low-visibility environments, UWB signals provide a robust alternative for real-time positioning. The proposed system uses general UWB devices to achieve accurate mapping and localization without relying on expensive LiDAR or other dedicated hardware. By utilizing only the distance and Received Signal Strength Indicator (RSSI) provided by UWB sensors in relation to anchors, we combine the motion of the tag-carrying agent with raycasting algorithm to construct a 2D occupancy grid map in real time. To enhance localization in challenging conditions, a Weighted Least Squares (WLS) method is employed. Extensive real-world experiments, including smoke-filled environments and simulated
Abstract:Trajectory planning for teleoperated space manipulators involves challenges such as accurately modeling system dynamics, particularly in free-floating modes with non-holonomic constraints, and managing time delays that increase model uncertainty and affect control precision. Traditional teleoperation methods rely on precise dynamic models requiring complex parameter identification and calibration, while data-driven methods do not require prior knowledge but struggle with time delays. A novel framework utilizing deep reinforcement learning (DRL) is introduced to address these challenges. The framework incorporates three methods: Mapping, Prediction, and State Augmentation, to handle delays when delayed state information is received at the master end. The Soft Actor Critic (SAC) algorithm processes the state information to compute the next action, which is then sent to the remote manipulator for environmental interaction. Four environments are constructed using the MuJoCo simulation platform to account for variations in base and target fixation: fixed base and target, fixed base with rotated target, free-floating base with fixed target, and free-floating base with rotated target. Extensive experiments with both constant and random delays are conducted to evaluate the proposed methods. Results demonstrate that all three methods effectively address trajectory planning challenges, with State Augmentation showing superior efficiency and robustness.
Abstract:This paper studies the problem of multi-agent trajectory prediction in crowded unknown environments. A novel energy function optimization-based framework is proposed to generate prediction trajectories. Firstly, a new energy function is designed for easier optimization. Secondly, an online optimization pipeline for calculating parameters and agents' velocities is developed. In this pipeline, we first design an efficient group division method based on Frechet distance to classify agents online. Then the strategy on decoupling the optimization of velocities and critical parameters in the energy function is developed, where the the slap swarm algorithm and gradient descent algorithms are integrated to solve the optimization problems more efficiently. Thirdly, we propose a similarity-based resample evaluation algorithm to predict agents' optimal goals, defined as the target-moving headings of agents, which effectively extracts hidden information in observed states and avoids learning agents' destinations via the training dataset in advance. Experiments and comparison studies verify the advantages of the proposed method in terms of prediction accuracy and speed.