Abstract:A stable 3D object detection model based on BEV paradigm with temporal information is very important for autonomous driving systems. However, current temporal fusion model use convolutional layer or deformable self-attention is not conducive to the exchange of global information of BEV space and has more computational cost. Recently, a newly proposed based model specialized in processing sequence called mamba has shown great potential in multiple downstream task. In this work, we proposed a mamba2-based BEV 3D object detection model named MambaBEV. We also adapt an end to end self driving paradigm to test the performance of the model. Our work performs pretty good results on nucences datasets:Our base version achieves 51.7% NDS. Our code will be available soon.
Abstract:Over the last decade, multi-tasking learning approaches have achieved promising results in solving panoptic driving perception problems, providing both high-precision and high-efficiency performance. It has become a popular paradigm when designing networks for real-time practical autonomous driving system, where computation resources are limited. This paper proposed an effective and efficient multi-task learning network to simultaneously perform the task of traffic object detection, drivable road area segmentation and lane detection. Our model achieved the new state-of-the-art (SOTA) performance in terms of accuracy and speed on the challenging BDD100K dataset. Especially, the inference time is reduced by half compared to the previous SOTA model. Code will be released in the near future.