Abstract:Roadside perception systems are increasingly crucial in enhancing traffic safety and facilitating cooperative driving for autonomous vehicles. Despite rapid technological advancements, a major challenge persists for this newly arising field: the absence of standardized evaluation methods and benchmarks for these systems. This limitation hampers the ability to effectively assess and compare the performance of different systems, thus constraining progress in this vital field. This paper introduces a comprehensive evaluation methodology specifically designed to assess the performance of roadside perception systems. Our methodology encompasses measurement techniques, metric selection, and experimental trial design, all grounded in real-world field testing to ensure the practical applicability of our approach. We applied our methodology in Mcity\footnote{\url{https://mcity.umich.edu/}}, a controlled testing environment, to evaluate various off-the-shelf perception systems. This approach allowed for an in-depth comparative analysis of their performance in realistic scenarios, offering key insights into their respective strengths and limitations. The findings of this study are poised to inform the development of industry-standard benchmarks and evaluation methods, thereby enhancing the effectiveness of roadside perception system development and deployment for autonomous vehicles. We anticipate that this paper will stimulate essential discourse on standardizing evaluation methods for roadside perception systems, thus pushing the frontiers of this technology. Furthermore, our results offer both academia and industry a comprehensive understanding of the capabilities of contemporary infrastructure-based perception systems.
Abstract:As vehicular communication and networking technologies continue to advance, infrastructure-based roadside perception emerges as a pivotal tool for connected automated vehicle (CAV) applications. Due to their elevated positioning, roadside sensors, including cameras and lidars, often enjoy unobstructed views with diminished object occlusion. This provides them a distinct advantage over onboard perception, enabling more robust and accurate detection of road objects. This paper presents MSight, a cutting-edge roadside perception system specifically designed for CAVs. MSight offers real-time vehicle detection, localization, tracking, and short-term trajectory prediction. Evaluations underscore the system's capability to uphold lane-level accuracy with minimal latency, revealing a range of potential applications to enhance CAV safety and efficiency. Presently, MSight operates 24/7 at a two-lane roundabout in the City of Ann Arbor, Michigan.
Abstract:Recently, with the rapid development of vehicle-to-infrastructure communication technologies, infrastructure-based roadside perception for cooperative driving has become a rising field. This paper focuses on one of its most critical challenges: the data-insufficiency problem. The lack of high-quality, diverse, labeled roadside sensor data leads to low robustness and low transferability in current roadside perception systems. In this paper, a novel approach is proposed to address this problem by creating synthesized training data using Augmented Reality and a Generative Adversarial Network. This method creates a synthesized dataset capable of training or fine-tuning a roadside perception detector that is robust to different weather and lighting conditions, or of adapting a detector to a new deployment location. We validate our approach at two intersections: the Mcity intersection and the State St/Ellsworth Rd roundabout. Our experiments show that (1) the detector can achieve good performance in all conditions when trained on synthesized data only, and (2) the performance of an existing detector trained with labeled data can be enhanced by synthesized data in harsh conditions.
Abstract:Traffic conflicts have been studied by the transportation research community as a surrogate safety measure for decades. However, because traffic conflicts are rare, collecting large-scale real-world traffic conflict data is extremely challenging. In this paper, we introduce and analyze ROCO, a real-world roundabout traffic conflict dataset. The data is collected at a two-lane roundabout at the intersection of State St. and W. Ellsworth Rd. in Ann Arbor, Michigan. We use raw video streams captured by four fisheye cameras installed at the roundabout as our input data source. We adopt a learning-based algorithm that identifies conflicts from video to find potential traffic conflicts, and then manually label them for dataset collection and annotation. In total, 557 traffic conflicts and 17 traffic crashes were collected from August 2021 to October 2021. We provide trajectory data of the traffic conflict scenes, extracted using our roadside perception system. A taxonomy based on traffic conflict severity, the reason for the traffic conflict, and its effect on the traffic flow is provided. From the collected traffic conflict data, we find that failure to yield to circulating vehicles when entering the roundabout is the largest contributing reason for traffic conflicts. The ROCO dataset will be made public in the near future.
Abstract:We propose a novel and pragmatic framework for traffic scene perception with roadside cameras. The proposed framework covers a full-stack roadside perception pipeline for infrastructure-assisted autonomous driving, including object detection, object localization, object tracking, and multi-camera information fusion. Unlike previous vision-based perception frameworks that rely upon depth offset or 3D annotation at training, we adopt a modular decoupling design and introduce a landmark-based 3D localization method, in which detection and localization are decoupled so that the model can be trained using only 2D annotations. The proposed framework applies to either optical or thermal cameras with pinhole or fish-eye lenses. Our framework is deployed at a two-lane roundabout located at Ellsworth Rd. and State St., Ann Arbor, MI, USA, providing 24/7 real-time traffic flow monitoring and high-precision vehicle trajectory extraction. The whole system runs efficiently on a low-power edge computing device with an all-component end-to-end delay of less than 20 ms.
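To make the idea of decoupling 2D detection from localization concrete, the following is a minimal sketch of one common way to map image detections to ground coordinates from surveyed landmarks. It is an illustrative assumption, not the method described in the abstract: it assumes a pinhole image (a fish-eye image would first need undistortion), a locally planar road surface, and exactly four landmark pairs, and it fits a planar homography. The function names `fit_homography`, `pixel_to_ground`, and `locate_box` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate landmark configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_homography(pixels, ground):
    """Fit a 3x3 homography (h33 fixed to 1) from four pixel -> ground pairs.

    Assumption: the landmarks lie on a planar road surface and no three
    of them are collinear.
    """
    if len(pixels) != 4 or len(ground) != 4:
        raise ValueError("this sketch expects exactly four landmark pairs")
    a, b = [], []
    for (u, v), (x, y) in zip(pixels, ground):
        # x = (h11 u + h12 v + h13) / (h31 u + h32 v + 1), likewise for y.
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def pixel_to_ground(h, u, v):
    """Project an image point onto the ground plane."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    x = (h[0][0] * u + h[0][1] * v + h[0][2]) / w
    y = (h[1][0] * u + h[1][1] * v + h[1][2]) / w
    return x, y


def locate_box(h, box):
    """Localize a 2D detection (u_min, v_min, u_max, v_max).

    Assumption: the bottom-center of the box approximates the point where
    the object touches the road.
    """
    u_min, _v_min, u_max, v_max = box
    return pixel_to_ground(h, (u_min + u_max) / 2.0, v_max)


# Hypothetical landmarks: image corners (pixels) and surveyed positions (meters).
landmark_pixels = [(0, 0), (100, 0), (100, 100), (0, 100)]
landmark_ground = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
H = fit_homography(landmark_pixels, landmark_ground)
position = locate_box(H, (40, 20, 60, 50))
```

Because the mapping depends only on the landmark pairs, a detector in such a setup needs nothing beyond 2D boxes as training labels. A practical system would likely use more than four landmarks with a least-squares fit and would account for lens distortion.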