Abstract:In autonomous driving, 3D detection provides more precise information to downstream tasks, including path planning and motion estimation, compared to 2D detection. Therefore, the need for 3D detection research has emerged. However, although single and multi-view images and depth maps obtained from the camera were used, detection accuracy was relatively low compared to other modality-based detectors due to the lack of geometric information. The proposed multi-modal 3D object detection combines semantic features obtained from images and geometric features obtained from point clouds, but there are difficulties in defining unified representation to fuse data existing in different domains and synchronization between them. In this paper, we propose SeSame : point-wise semantic feature as a new presentation to ensure sufficient semantic information of the existing LiDAR-only based 3D detection. Experiments show that our approach outperforms previous state-of-the-art at different levels of difficulty in car and performance improvement on the KITTI object detection benchmark. Our code is available at https://github.com/HAMA-DL-dev/SeSame
Abstract:This paper presents a hybrid motion planning strategy that combines a deep generative network with a conventional motion planning method. Existing planning methods such as A* and Hybrid A* are widely used in path planning tasks because of their ability to determine feasible paths even in complex environments; however, they have limitations in terms of efficiency. To overcome these limitations, a path planning algorithm based on a neural network, namely the neural Hybrid A*, is introduced. This paper proposes using a conditional variational autoencoder (CVAE) to guide the search algorithm by exploiting the ability of CVAE to learn information about the planning space given the information of the parking environment. A non-uniform expansion strategy is utilized based on a distribution of feasible trajectories learned in the demonstrations. The proposed method effectively learns the representations of a given state, and shows improvement in terms of algorithm performance.
Abstract:This paper presents online-capable deep learning model for probabilistic vehicle trajectory prediction. We propose a simple encoder-decoder architecture based on multi-head attention. The proposed model generates the distribution of the predicted trajectories for multiple vehicles in parallel. Our approach to model the interactions can learn to attend to a few influential vehicles in an unsupervised manner, which can improve the interpretability of the network. The experiments using naturalistic trajectories at highway show the clear improvement in terms of positional error on both longitudinal and lateral direction.