Abstract:Road ponding, a prevalent traffic hazard, poses a serious threat to road safety by causing vehicles to lose control and leading to accidents ranging from minor fender benders to severe collisions. Existing technologies struggle to accurately identify road ponding due to complex road textures and variable ponding coloration influenced by reflection characteristics. To address this challenge, we propose a novel approach called Self-Attention-based Global Saliency-Enhanced Network (AGSENet) for proactive road ponding detection and traffic safety improvement. AGSENet incorporates saliency detection techniques through the Channel Saliency Information Focus (CSIF) and Spatial Saliency Information Enhancement (SSIE) modules. The CSIF module, integrated into the encoder, employs self-attention to highlight similar features by fusing spatial and channel information. The SSIE module, embedded in the decoder, refines edge features and reduces noise by leveraging correlations across different feature levels. To ensure accurate and reliable evaluation, we corrected significant mislabeling and missing annotations in the Puddle-1000 dataset. Additionally, we constructed the Foggy-Puddle and Night-Puddle datasets for road ponding detection in low-light and foggy conditions, respectively. Experimental results demonstrate that AGSENet outperforms existing methods, achieving IoU improvements of 2.03\%, 0.62\%, and 1.06\% on the Puddle-1000, Foggy-Puddle, and Night-Puddle datasets, respectively, setting a new state-of-the-art in this field. Finally, we verified the algorithm's reliability on edge computing devices. This work provides a valuable reference for proactive warning research in road traffic safety.
Abstract:Visual bird's eye view (BEV) perception, due to its excellent perceptual capabilities, is progressively replacing costly LiDAR-based perception systems, especially in the realm of urban intelligent driving. However, this type of perception still relies on LiDAR data to construct ground truth databases, a process that is both cumbersome and time-consuming. Moreover, most massproduced autonomous driving systems are only equipped with surround camera sensors and lack LiDAR data for precise annotation. To tackle this challenge, we propose a fine-tuning method for BEV perception network based on visual 2D semantic perception, aimed at enhancing the model's generalization capabilities in new scene data. Considering the maturity and development of 2D perception technologies, our method significantly reduces the dependency on high-cost BEV ground truths and shows promising industrial application prospects. Extensive experiments and comparative analyses conducted on the nuScenes and Waymo public datasets demonstrate the effectiveness of our proposed method.
Abstract:This study introduces the 4D Risk Occupancy within a vehicle-road-cloud architecture, integrating the road surface spatial, risk, and temporal dimensions, and endowing the algorithm with beyond-line-of-sight, all-angles, and efficient abilities. The algorithm simplifies risk modeling by focusing on directly observable information and key factors, drawing on the concept of Occupancy Grid Maps (OGM), and incorporating temporal prediction to effectively map current and future risk occupancy. Compared to conventional driving risk fields and grid occupancy maps, this algorithm can map global risks more efficiently, simply, and reliably. It can integrate future risk information, adapting to dynamic traffic environments. The 4D Risk Occupancy also unifies the expression of BEV detection and lane line detection results, enhancing the intuitiveness and unity of environmental perception. Using DAIR-V2X data, this paper validates the 4D Risk Occupancy algorithm and develops a local path planning model based on it. Qualitative experiments under various road conditions demonstrate the practicality and robustness of this local path planning model. Quantitative analysis shows that the path planning based on risk occupation significantly improves trajectory planning performance, increasing safety redundancy by 12.5% and reducing average deceleration by 5.41% at an initial braking speed of 8 m/s, thereby improving safety and comfort. This work provides a new global perception method and local path planning method through Vehicle-Road-Cloud architecture, offering a new perceptual paradigm for achieving safer and more efficient autonomous driving.
Abstract:The environmental perception of autonomous vehicles in normal conditions have achieved considerable success in the past decade. However, various unfavourable conditions such as fog, low-light, and motion blur will degrade image quality and pose tremendous threats to the safety of autonomous driving. That is, when applied to degraded images, state-of-the-art visual models often suffer performance decline due to the feature content loss and artifact interference caused by statistical and structural properties disruption of captured images. To address this problem, this work proposes a novel Deep Channel Prior (DCP) for degraded visual recognition. Specifically, we observe that, in the deep representation space of pre-trained models, the channel correlations of degraded features with the same degradation type have uniform distribution even if they have different content and semantics, which can facilitate the mapping relationship learning between degraded and clear representations in high-sparsity feature space. Based on this, a novel plug-and-play Unsupervised Feature Enhancement Module (UFEM) is proposed to achieve unsupervised feature correction, where the multi-adversarial mechanism is introduced in the first stage of UFEM to achieve the latent content restoration and artifact removal in high-sparsity feature space. Then, the generated features are transferred to the second stage for global correlation modulation under the guidance of DCP to obtain high-quality and recognition-friendly features. Evaluations of three tasks and eight benchmark datasets demonstrate that our proposed method can comprehensively improve the performance of pre-trained models in real degradation conditions. The source code is available at https://github.com/liyuhang166/Deep_Channel_Prior
Abstract:In this paper, we propose a second-order extension of the continuous-time game-theoretic mirror descent (MD) dynamics, referred to as MD2, which converges to mere (but not necessarily strict) variationally stable states (VSS) without using common auxiliary techniques such as averaging or discounting. We show that MD2 enjoys no-regret as well as exponential rate of convergence towards a strong VSS upon a slight modification. Furthermore, MD2 can be used to derive many novel primal-space dynamics. Lastly, using stochastic approximation techniques, we provide a convergence guarantee of discrete-time MD2 with noisy observations towards interior mere VSS. Selected simulations are provided to illustrate our results.
Abstract:In this paper, we consider concave continuous-kernel games characterized by monotonicity properties and propose discounted mirror descent-type dynamics. We introduce two classes of dynamics whereby the associated mirror map is constructed based on a strongly convex or a Legendre regularizer. Depending on the properties of the regularizer we show that these new dynamics can converge asymptotically in concave games with monotone (negative) pseudo-gradient. Furthermore, we show that when the regularizer enjoys strong convexity, the resulting dynamics can converge even in games with hypo-monotone (negative) pseudo-gradient, which corresponds to a shortage of monotonicity.
Abstract:In this paper, we utilize results from convex analysis and monotone operator theory to derive additional properties of the softmax function that have not yet been covered in the existing literature. In particular, we show that the softmax function is the monotone gradient map of the log-sum-exp function. By exploiting this connection, we show that the inverse temperature parameter determines the Lipschitz and co-coercivity properties of the softmax function. We then demonstrate the usefulness of these properties through an application in game-theoretic reinforcement learning.