Ding Yuan

Driving by Hybrid Navigation: An Online HD-SD Map Association Framework and Benchmark for Autonomous Vehicles

Jul 10, 2025

Jiaxu Wan, Xu Wang, Mengwei Xie, Xinyuan Chang, Xinran Liu, Zheng Pan, Mu Xu, Ding Yuan

Abstract: Autonomous vehicles rely on global standard-definition (SD) maps for road-level route planning and online local high-definition (HD) maps for lane-level navigation. However, recent work concentrates on constructing online HD maps and often overlooks the association of global SD maps with online HD maps for hybrid navigation, which makes online HD maps difficult to use in the real world. To address this gap in the navigation capability of autonomous vehicles, we introduce Online Map Association (OMA), the first benchmark for the association of hybrid navigation-oriented online maps, which enhances the planning capabilities of autonomous vehicles. Built on existing datasets, OMA contains 480k roads and 260k lane paths and provides corresponding metrics for evaluating model performance. Additionally, we propose a novel framework, named Map Association Transformer, as the baseline method; it uses path-aware attention and spatial attention mechanisms to capture geometric and topological correspondences. The code and dataset can be accessed at https://github.com/WallelWan/OMA-MAT.

* 23 pages, 10 figures, 9 tables

SP$^2$T: Sparse Proxy Attention for Dual-stream Point Transformer

Dec 16, 2024

Jiaxu Wan, Hong Zhang, Ziqi He, Qishu Wang, Ding Yuan, Yifan Yang

Abstract: In 3D understanding, point transformers have yielded significant advances in broadening the receptive field. However, further enhancement of the receptive field is hindered by the constraints of grouping attention. Proxy-based models, a hot topic in image and language feature extraction, use global or local proxies to expand the model's receptive field. However, global proxy-based methods fail to precisely determine proxy positions and are not suited to point-cloud tasks such as segmentation and detection, while existing local proxy-based methods for images face difficulties in global-local balance, proxy sampling across varied point clouds, and parallel cross-attention computation for sparse association. In this paper, we present SP$^2$T, a local proxy-based dual-stream point transformer that promotes a global receptive field while maintaining a balance between local and global information. To make 3D proxy sampling robust, we propose spatial-wise proxy sampling with vertex-based point-proxy associations, which ensures robust sampling across point clouds of many scales. To make association computation economical, we introduce sparse proxy attention combined with a table-based relative bias, which enables low-cost and precise interactions between proxy and point features. Comprehensive experiments across multiple datasets show that our model achieves SOTA performance in downstream tasks. The code has been released at https://github.com/TerenceWallel/Sparse-Proxy-Point-Transformer.

* 13 pages, 14 figures, 14 tables

Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy

Aug 22, 2024

Hong Zhang, Yixuan Lyu, Qian Yu, Hanyang Liu, Huimin Ma, Ding Yuan, Yifan Yang

Abstract: In the domain of Camouflaged Object Segmentation (COS), despite continuous improvements in segmentation performance, the underlying mechanisms of effective camouflage remain poorly understood, akin to a black box. To address this gap, we present the first comprehensive study of how camouflage attributes affect the effectiveness of camouflage patterns, offering a quantitative framework for evaluating camouflage designs. To support this analysis, we have compiled the first dataset comprising descriptions of camouflaged objects and their attribute contributions, termed COD-Text And X-attributions (COD-TAX). Moreover, drawing inspiration from the hierarchical way humans process information (from high-level textual descriptions of overarching scenarios, through mid-level summaries of local areas, to low-level pixel data for detailed analysis), we have developed a robust framework that combines textual and visual information for the task of COS, named Attribution CUe Modeling with Eye-fixation Network (ACUMEN). ACUMEN demonstrates superior performance, outperforming nine leading methods across three widely used datasets. We conclude by highlighting key insights derived from the attributes identified in our study. Code: https://github.com/lyu-yx/ACUMEN.

* Accepted by ECCV 2024

Foresight of Graph Reinforcement Learning Latent Permutations Learnt by Gumbel Sinkhorn Network

Oct 23, 2021

Tianqi Shen, Hong Zhang, Ding Yuan, Jiaping Xiao, Yifan Yang

Abstract: Cooperation is vitally important in multi-agent environments, and several reinforcement learning algorithms combined with graph neural networks have therefore been proposed to model the interplay between agents. However, highly complex and dynamic multi-agent environments require more sophisticated graph neural networks that can represent not only the graph topology but also how that topology evolves as agents emerge, disappear, and move. To tackle these difficulties, we propose Gumbel Sinkhorn graph attention reinforcement learning, in which a graph attention network represents the underlying graph topology of the multi-agent environment and, with the help of a Gumbel Sinkhorn network that learns latent permutations, adapts better to the dynamic topology. Empirically, simulation results show that our proposed graph reinforcement learning methodology outperforms existing methods in the PettingZoo multi-agent environment by learning latent permutations.
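
For readers unfamiliar with the term, the generic Gumbel-Sinkhorn operator can be sketched as follows. This is a minimal illustration of the standard technique (Gumbel noise added to a score matrix, followed by alternating row and column normalization in log space), not the authors' network: the function names, the temperature `tau`, the iteration count, and the toy score matrix are all assumptions made for this example.

```python
import math
import random


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn(log_alpha, n_iters=20):
    """Alternately normalize rows and columns of exp(log_alpha).

    Works in log space; the result approaches a doubly stochastic matrix
    (a soft relaxation of a permutation matrix) as n_iters grows.
    """
    n = len(log_alpha)
    la = [row[:] for row in log_alpha]
    for _ in range(n_iters):
        # Row step: subtract each row's log-sum-exp so rows sum to 1.
        for i in range(n):
            lse = _logsumexp(la[i])
            la[i] = [v - lse for v in la[i]]
        # Column step: subtract each column's log-sum-exp so columns sum to 1.
        for j in range(n):
            lse = _logsumexp([la[i][j] for i in range(n)])
            for i in range(n):
                la[i][j] -= lse
    return [[math.exp(v) for v in row] for row in la]


def gumbel_sinkhorn(scores, tau=1.0, n_iters=20, rng=None):
    """Perturb scores with Gumbel(0, 1) noise, scale by 1/tau, then run Sinkhorn.

    Smaller tau pushes the output closer to a hard permutation matrix.
    """
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    noisy = []
    for row in scores:
        noisy_row = []
        for s in row:
            # Clamp the uniform sample away from 0 and 1 to keep the logs finite.
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            gumbel = -math.log(-math.log(u))
            noisy_row.append((s + gumbel) / tau)
        noisy.append(noisy_row)
    return sinkhorn(noisy, n_iters)


# Toy 3x3 score matrix (hypothetical values) favouring the identity matching.
toy_scores = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
soft_perm = gumbel_sinkhorn(toy_scores, tau=1.0, n_iters=50)
```

Because every step is a differentiable normalization, such an operator can sit inside a network trained by gradient descent; how the paper couples it with graph attention is not described in the abstract and is not reproduced here.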
