Abstract:This paper presents a probabilistic generalization of the generalized optimal subpattern assignment (GOSPA) metric, termed the P-GOSPA metric. GOSPA is a popular metric for evaluating the distance between finite sets, typically in multi-object estimation applications. P-GOSPA extends GOSPA to the space of multi-Bernoulli set densities, incorporating the inherent uncertainty in probabilistic multi-object representations. In addition, P-GOSPA retains the interpretability of GOSPA, such as its decomposability into localization, missed detection, and false detection errors, in a sound manner. Examples and simulations are presented to demonstrate the efficacy of P-GOSPA.
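As a rough illustration of the decomposition mentioned in the abstract above, the following is a minimal sketch of the standard (non-probabilistic) GOSPA metric with alpha = 2, not of P-GOSPA itself. It assumes Euclidean base distance, point objects given as tuples, and sets small enough for a brute-force search over assignments; the function name `gospa` and the parameter defaults are illustrative choices, not taken from the paper.

```python
from itertools import permutations
import math


def gospa(truth, estimates, c=2.0, p=2):
    """Brute-force GOSPA (alpha = 2) between two small sets of points.

    truth, estimates: lists of equal-dimension tuples (hypothetical inputs).
    c: cutoff distance, p: exponent. Returns the distance and its
    decomposition into localization, missed, and false costs.
    """
    # Enumerate injections from the smaller set into the larger one.
    swapped = len(truth) > len(estimates)
    small, large = (estimates, truth) if swapped else (truth, estimates)

    best = None
    for perm in permutations(range(len(large)), len(small)):
        loc, n_assigned = 0.0, 0
        for i, j in enumerate(perm):
            d = math.dist(small[i], large[j])
            if d < c:  # pairs at or beyond the cutoff count as unassigned
                loc += d ** p
                n_assigned += 1
        n_missed = len(truth) - n_assigned
        n_false = len(estimates) - n_assigned
        total = loc + (c ** p / 2) * (n_missed + n_false)
        if best is None or total < best[0]:
            best = (total, loc, n_missed, n_false)

    total, loc, n_missed, n_false = best
    return {
        "distance": total ** (1 / p),
        "localization": loc,
        "missed": (c ** p / 2) * n_missed,
        "false": (c ** p / 2) * n_false,
    }


result = gospa(
    truth=[(0.0, 0.0), (10.0, 0.0)],
    estimates=[(0.5, 0.0), (20.0, 0.0), (30.0, 0.0)],
)
print(result)
```

The three cost terms are reported before the 1/p root is taken, so they sum to the p-th power of the returned distance. A practical implementation would replace the brute-force search with an optimal assignment solver.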
Abstract:Multi-object tracking algorithms are deployed in various applications, each with unique performance requirements. For example, track switches pose significant challenges for offline scene understanding, as they hinder the accuracy of data interpretation. Conversely, in online surveillance applications, their impact is often minimal. This disparity underscores the need for application-specific performance evaluations that are both simple and mathematically sound. The trajectory generalized optimal sub-pattern assignment (TGOSPA) metric offers a principled approach to evaluate multi-object tracking performance. It accounts for localization errors, the number of missed and false objects, and the number of track switches, providing a comprehensive assessment framework. This paper illustrates the effective use of the TGOSPA metric in computer vision tasks, addressing challenges posed by the need for application-specific scoring methodologies. By exploring the TGOSPA parameter selection, we enable users to compare, comprehend, and optimize the performance of algorithms tailored for specific tasks, such as target tracking and training of detector or re-ID modules.
Abstract:The concept of 6G distributed integrated sensing and communications (DISAC) builds upon the functionality of integrated sensing and communications (ISAC) by integrating distributed architectures, significantly enhancing both sensing and communication coverage and performance. In 6G DISAC systems, tracking target trajectories requires base stations (BSs) to hand over their tracked targets to neighboring BSs. Determining what information to share, where, how, and when is critical to effective handover. This paper addresses the target handover challenge in DISAC systems and introduces a method enabling BSs to share essential target trajectory information at appropriate time steps, facilitating seamless handovers to other BSs. The target tracking problem is tackled using the standard trajectory Poisson multi-Bernoulli mixture (TPMBM) filter, enhanced with the proposed handover algorithm. Simulation results confirm the effectiveness of the implemented tracking solution.
Abstract:The probability hypothesis density (PHD) and Poisson multi-Bernoulli (PMB) filters are two popular set-type multi-object filters. Motivated by the fact that the multi-object filtering density after each update step in the PHD filter is a PMB without approximation, in this paper we present a multi-object smoother involving PHD forward filtering and PMB backward smoothing. This is achieved by first running the PHD filtering recursion in the forward pass and extracting the PMB filtering densities after each update step before the Poisson point process approximation, which is inherent in the PHD filter update. Then, in the backward pass, we apply backward simulation for sets of trajectories to the extracted PMB filtering densities. We call the resulting multi-object smoother the hybrid PHD-PMB trajectory smoother. Notably, the hybrid PHD-PMB trajectory smoother can provide smoothed trajectory estimates for the PHD filter without labeling or tagging, which is not possible for existing PHD smoothers. Also, compared to the trajectory PHD filter, which can only estimate alive trajectories, the hybrid PHD-PMB trajectory smoother enables the estimation of the set of all trajectories. Simulation results demonstrate that the hybrid PHD-PMB trajectory smoother outperforms the PHD filter in terms of both state and cardinality estimates, and the trajectory PHD filter in terms of false detections.
Abstract:Simultaneous localization and mapping (SLAM) methods need to solve both the data association (DA) problem and the joint estimation of the sensor trajectory and the map, conditioned on a DA. In this paper, we propose a novel integrated approach to solve both the DA problem and the batch SLAM problem simultaneously, combining random finite set (RFS) theory and the graph-based SLAM approach. A sampling method based on the Poisson multi-Bernoulli mixture (PMBM) density is designed for dealing with the DA uncertainty, and a graph-based SLAM solver is applied for the conditional SLAM problem. Finally, a post-processing approach is applied to merge SLAM results from different iterations. Using synthetic data, it is demonstrated that the proposed SLAM approach achieves performance close to the posterior Cramér-Rao bound, and outperforms state-of-the-art RFS-based SLAM filters in high clutter and high process noise scenarios.
Abstract:Accurate and timely determination of a vehicle's current lane within a map is a critical task in autonomous driving systems. This paper utilizes an Early Time Series Classification (ETSC) method to achieve precise and rapid ego-lane identification in real-world driving data. The method begins by assessing the similarities between map and lane markings perceived by the vehicle's camera using measurement model quality metrics. These metrics are then fed into a selected ETSC method, comprising a probabilistic classifier and a tailored trigger function, optimized via multi-objective optimization to strike a balance between early prediction and accuracy. Our solution has been evaluated on a comprehensive dataset consisting of 114 hours of real-world traffic data, collected across 5 different countries by our test vehicles. Results show that by leveraging road lane-marking geometry and lane-marking type derived solely from a camera, our solution achieves an impressive accuracy of 99.6%, with an average prediction time of only 0.84 seconds.
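The pairing of a probabilistic classifier with a trigger function described in the abstract above can be sketched as follows. This is only a minimal illustration assuming the simplest possible trigger, a fixed confidence threshold; the paper's tailored, multi-objective-optimized trigger is not specified here, and the names `early_classify`, `prob_stream`, and `threshold` are hypothetical.

```python
def early_classify(prob_stream, threshold=0.9):
    """Return (time_index, label) at the first step whose top class
    probability reaches the threshold (an assumed, simplistic trigger).

    prob_stream: iterable of per-step class-probability lists, as would be
    produced by some probabilistic classifier (not implemented here).
    Falls back to the last step if the trigger never fires; returns None
    for an empty stream.
    """
    last = None
    for t, probs in enumerate(prob_stream):
        # Index of the most probable class (e.g. a lane index).
        label = max(range(len(probs)), key=probs.__getitem__)
        last = (t, label)
        if probs[label] >= threshold:
            return t, label  # trigger fires: stop early
    return last


stream = [
    [0.40, 0.30, 0.30],
    [0.50, 0.40, 0.10],
    [0.05, 0.92, 0.03],
]
print(early_classify(stream))
```

Lowering the threshold yields earlier but potentially less accurate decisions, which is the trade-off between early prediction and accuracy that the abstract refers to.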
Abstract:High-definition maps with accurate lane-level information are crucial for autonomous driving, but the creation of these maps is a resource-intensive process. To this end, we present a cost-effective solution to create lane-level roadmaps using only the global navigation satellite system (GNSS) and a camera on customer vehicles. Our proposed solution utilizes a prior standard-definition (SD) map, GNSS measurements, visual odometry, and lane marking edge detection points to simultaneously estimate the vehicle's 6D pose, its position within an SD map, and the 3D geometry of traffic lines. This is achieved using a Bayesian simultaneous localization and multi-object tracking filter, where the estimation of traffic lines is formulated as a multiple extended object tracking problem, solved using a trajectory Poisson multi-Bernoulli mixture (TPMBM) filter. In TPMBM filtering, traffic lines are modeled using B-spline trajectories, and each trajectory is parameterized by a sequence of control points. The proposed solution has been evaluated using experimental data collected by a test vehicle driving on a highway. Preliminary results show that the traffic line estimates, overlaid on the satellite image, generally align with the lane markings up to some lateral offsets.
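To make the idea of a curve parameterized by a sequence of control points concrete, the sketch below evaluates one segment of a uniform cubic B-spline. The abstract above does not state the spline degree or knot vector, so the uniform cubic form is an assumption, and the function name `cubic_bspline_point` is hypothetical.

```python
def cubic_bspline_point(ctrl, seg, u):
    """Evaluate a uniform cubic B-spline at local parameter u in [0, 1].

    ctrl: list of control points (equal-length tuples, e.g. 3D positions).
    seg:  segment index; the segment uses ctrl[seg] .. ctrl[seg + 3].
    """
    if not 0 <= seg <= len(ctrl) - 4:
        raise ValueError("segment index out of range")
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")

    # Uniform cubic B-spline basis functions; they sum to 1 for any u.
    basis = (
        (1 - u) ** 3 / 6,
        (3 * u**3 - 6 * u**2 + 4) / 6,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6,
        u**3 / 6,
    )
    pts = ctrl[seg:seg + 4]
    dim = len(pts[0])
    # Weighted sum of the four active control points, per coordinate.
    return tuple(sum(b * p[k] for b, p in zip(basis, pts)) for k in range(dim))


control_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(cubic_bspline_point(control_points, seg=0, u=0.5))
```

Because each segment depends on only four control points, appending a control point extends the curve without changing earlier segments, which is one reason a control-point sequence is a convenient state for a line that grows as the vehicle drives.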
Abstract:Roadside perception is a key component in intelligent transportation systems. In this paper, we present a novel three-dimensional (3D) extended object tracking (EOT) method, which simultaneously estimates the object kinematics and extent state, in roadside perception using both the radar and camera data. Because of the influence of sensor viewing angle and limited angle resolution, radar measurements from objects are sparse and non-uniformly distributed, leading to inaccuracies in object extent and position estimation. To address this problem, we present a novel spherical Gaussian function weighted Gaussian mixture model. This model assumes that radar measurements originate from a series of probabilistic weighted radar reflectors on the vehicle's extent. Additionally, we utilize visual detection of vehicle keypoints to provide additional information on the positions of radar reflectors. Since keypoints may not always correspond to radar reflectors, we propose an elastic skeleton fusion mechanism, which constructs a virtual force to establish the relationship between the radar reflectors on the vehicle and its extent. Furthermore, to better describe the kinematic state of the vehicle and constrain its extent state, we develop a new 3D constant turn rate and velocity motion model, considering the complex 3D motion of the vehicle relative to the roadside sensor. Finally, we apply variational Bayesian approximation to the intractable measurement update step to enable recursive Bayesian estimation of the object's state. Simulation results using the Carla simulator and experimental results on the nuScenes dataset demonstrate the effectiveness and superiority of the proposed method in comparison to several state-of-the-art 3D EOT methods.
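For orientation, the sketch below shows the standard planar (2D) constant turn rate and velocity prediction step, which the 3D motion model named in the abstract above generalizes. It is not the paper's 3D model; the state layout `(x, y, heading, speed, turn_rate)` and the function name `ctrv_predict` are assumptions made for illustration, and process noise is omitted.

```python
import math


def ctrv_predict(state, dt, eps=1e-9):
    """Propagate a planar CTRV state by dt seconds (noise-free).

    state: (x, y, heading [rad], speed [m/s], turn_rate [rad/s]).
    """
    x, y, heading, speed, turn_rate = state
    if abs(turn_rate) > eps:
        # Motion along a circular arc of radius speed / turn_rate.
        new_heading = heading + turn_rate * dt
        r = speed / turn_rate
        x += r * (math.sin(new_heading) - math.sin(heading))
        y += r * (math.cos(heading) - math.cos(new_heading))
    else:
        # Limit of zero turn rate: straight-line motion.
        new_heading = heading
        x += speed * dt * math.cos(heading)
        y += speed * dt * math.sin(heading)
    return (x, y, new_heading, speed, turn_rate)


# Quarter circle of radius 1 m traversed in 1 s.
print(ctrv_predict((0.0, 0.0, 0.0, math.pi / 2, math.pi / 2), dt=1.0))
```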
Abstract:Multiple extended target tracking (ETT) has gained increasing attention due to the development of high-precision LiDAR and radar sensors in automotive applications. For LiDAR point cloud-based vehicle tracking, this paper presents a probabilistic measurement-region association (PMRA) ETT model, which can describe the complex measurement distribution by partitioning the target extent into different regions. The PMRA model overcomes the drawbacks of previous data-region association (DRA) models by eliminating the approximation error of constrained estimation and using continuous integrals to more reliably calculate the association probabilities. Furthermore, the PMRA model is integrated with the Poisson multi-Bernoulli mixture (PMBM) filter for tracking multiple vehicles. Simulation results illustrate the superior estimation accuracy of the proposed PMRA-PMBM filter in terms of both positions and extents of the vehicles compared with PMBM filters using the gamma Gaussian inverse Wishart and DRA implementations.
Abstract:Few-shot segmentation aims to train a segmentation model that can quickly adapt to a novel task for which only a few annotated images are provided. Most recent models have adopted a prototype-based paradigm for few-shot inference. These approaches may have limited generalization capacity beyond the standard 1- or 5-shot settings. In this paper, we closely examine and reevaluate the fine-tuning-based learning scheme that fine-tunes the classification layer of a deep segmentation network pre-trained on diverse base classes. To improve the generalizability of the classification layer optimized with sparsely annotated samples, we introduce an instance-aware data augmentation (IDA) strategy that augments the support images based on the relative sizes of the target objects. The proposed IDA effectively increases the support set's diversity and promotes the distribution consistency between support and query images. On the other hand, the large visual difference between query and support images may hinder knowledge transfer and cripple the segmentation performance. To cope with this challenge, we introduce the local consensus guided cross attention (LCCA) to align the query feature with support features based on their dense correlation, further improving the model's generalizability to the query image. The significant performance improvements on the standard few-shot segmentation benchmarks PASCAL-$5^i$ and COCO-$20^i$ verify the efficacy of our proposed method.