Abstract:In this work, we propose \textit{MVFuseNet}, a novel end-to-end method for joint object detection and motion forecasting from a temporal sequence of LiDAR data. Most existing methods operate in a single view by projecting data into either the range view (RV) or the bird's eye view (BEV). In contrast, we propose a method that effectively utilizes both RV and BEV for spatio-temporal feature learning as part of a temporal fusion network, as well as for multi-scale feature learning in the backbone network. Further, we propose a novel sequential fusion approach that effectively utilizes multiple views in the temporal fusion network. We show the benefits of our multi-view approach for the tasks of detection and motion forecasting on two large-scale self-driving data sets, achieving state-of-the-art results. Furthermore, we show that MVFuseNet scales well to large operating ranges while maintaining real-time performance.
Abstract:In many different fields, interactions between objects play a critical role in determining their behavior. Graph neural networks (GNNs) have emerged as a powerful tool for modeling interactions, although often at the cost of adding considerable complexity and latency. In this paper, we consider the problem of spatial interaction modeling in the context of predicting the motion of actors around autonomous vehicles, and investigate alternative approaches to GNNs. We revisit convolutions and show that they can achieve performance comparable to graph networks in modeling spatial interactions with lower latency, thus providing an effective and efficient alternative in time-critical systems. Moreover, we propose a novel interaction loss to further improve the interaction modeling of the considered methods.
Abstract:Detection of surrounding objects and their motion prediction are critical components of a self-driving system. Recently proposed models that jointly address these tasks rely on a number of sensors to achieve state-of-the-art performance. However, this increases system complexity and may result in a brittle model that overfits to any single sensor modality while ignoring others, leading to reduced generalization. We focus on this important problem and analyze the contribution of sensor modalities to model performance. In addition, we investigate the use of sensor dropout to mitigate the above-mentioned issues, leading to a more robust, better-performing model on real-world driving data.
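A minimal sketch of how training-time sensor dropout could be implemented is given below; the modality list, drop probability, and zeroing strategy are illustrative assumptions rather than the paper's exact scheme. The idea is simply that the fused model occasionally sees one modality blanked out, so it cannot rely on any single sensor.

import torch
import torch.nn as nn


class SensorDropout(nn.Module):
    """Randomly zero out the features of one modality during training (sketch)."""

    def __init__(self, num_modalities, drop_prob=0.2):
        super().__init__()
        self.num_modalities = num_modalities
        self.drop_prob = drop_prob

    def forward(self, features):
        # features: one tensor per modality, e.g. [lidar_feat, camera_feat, radar_feat]
        if not self.training or torch.rand(1).item() > self.drop_prob:
            return features
        drop_idx = int(torch.randint(self.num_modalities, (1,)).item())
        return [f if i != drop_idx else torch.zeros_like(f)
                for i, f in enumerate(features)]


# Example usage: with two modalities, one is dropped in roughly 20% of training batches.
lidar, camera = torch.randn(2, 64, 128, 128), torch.randn(2, 64, 128, 128)
dropout = SensorDropout(num_modalities=2)
dropout.train()
fused = torch.cat(dropout([lidar, camera]), dim=1)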
Abstract:Object detection is a critical component of a self-driving system, tasked with inferring the current states of the surrounding traffic actors. While there exist a number of studies on the problem of inferring the position and shape of vehicle actors, understanding actors' orientation remains a challenge for existing state-of-the-art detectors. Orientation is an important property for downstream modules of an autonomous system, particularly relevant for motion prediction of stationary or reversing actors, where current approaches struggle. We focus on this task and present a method that extends existing models that perform joint object detection and motion prediction, allowing us to more accurately infer vehicle orientations. In addition, the approach is able to quantify prediction uncertainty, outputting the probability that the inferred orientation is flipped, which allows for improved motion prediction and safer autonomous operations. Empirical results show the benefits of the approach, obtaining state-of-the-art performance on the open-sourced nuScenes data set.
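The sketch below illustrates one way such a flip-aware orientation output could be decoded: a (sin, cos) heading regression plus a logit for the probability that the heading is reversed by 180 degrees. The head layout and names are assumptions for illustration only, not the paper's exact parameterization.

import torch


def decode_orientation(sin_cos, flip_logit):
    # sin_cos: (N, 2) raw (sin, cos) outputs of the orientation head.
    # flip_logit: (N,) logit that the heading should be rotated by pi.
    heading = torch.atan2(sin_cos[:, 0], sin_cos[:, 1])
    p_flip = torch.sigmoid(flip_logit)
    # Rotate the heading by pi where the model believes the orientation is reversed.
    heading = torch.where(p_flip > 0.5, heading + torch.pi, heading)
    # Wrap back into [-pi, pi).
    heading = torch.remainder(heading + torch.pi, 2 * torch.pi) - torch.pi
    return heading, p_flip


# Example: two detections pointing along +x, the second judged likely flipped.
headings, p_flip = decode_orientation(torch.tensor([[0.0, 1.0], [0.0, 1.0]]),
                                      torch.tensor([-3.0, 3.0]))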
Abstract:A commonly-used representation for motion prediction of actors is a sequence of waypoints (comprising positions and orientations) for each actor at discrete future time-points. While this approach is simple and flexible, it can exhibit unrealistic higher-order derivatives (such as acceleration) and approximation errors at intermediate time steps. To address this issue, we propose a simple and general representation for temporally continuous probabilistic trajectory prediction that is based on polynomial trajectory parameterization. We evaluate the proposed representation on supervised trajectory prediction tasks using two large self-driving data sets. The results show realistic higher-order derivatives and better accuracy at interpolated time-points, as well as the benefits of the inferred noise distributions over the trajectories. Extensive experimental studies based on existing state-of-the-art models demonstrate the effectiveness of the proposed approach relative to other representations in predicting the future motions of vehicle, bicyclist, and pedestrian traffic actors.
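As a rough illustration of a polynomial trajectory parameterization, the sketch below evaluates positions and their higher-order derivatives at arbitrary continuous query times from predicted coefficients; the polynomial degree, coefficient layout, and example values are assumptions, not the paper's exact formulation.

import numpy as np


def eval_poly_trajectory(coeffs, t):
    """coeffs: (2, K+1) polynomial coefficients for x(t) and y(t), highest degree first.
    t: (T,) query times in seconds. Returns position, velocity, acceleration, each (T, 2)."""
    pos = np.stack([np.polyval(c, t) for c in coeffs], axis=-1)
    vel = np.stack([np.polyval(np.polyder(c, 1), t) for c in coeffs], axis=-1)
    acc = np.stack([np.polyval(np.polyder(c, 2), t) for c in coeffs], axis=-1)
    return pos, vel, acc


# Example: a cubic trajectory queried at arbitrary intermediate timestamps,
# giving smooth, temporally continuous positions and derivatives.
coeffs = np.array([[0.1, -0.2, 8.0, 0.0],   # x(t)
                   [0.0,  0.3, 0.5, 0.0]])  # y(t)
pos, vel, acc = eval_poly_trajectory(coeffs, np.linspace(0.0, 5.0, 11))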
Abstract:In this paper, we present LiRaNet, a novel end-to-end trajectory prediction method which utilizes radar sensor information along with widely used lidar and high definition (HD) maps. Automotive radar provides rich, complementary information, allowing for longer range vehicle detection as well as instantaneous radial velocity measurements. However, there are factors that make the fusion of lidar and radar information challenging, such as the relatively low angular resolution of radar measurements, their sparsity, and the lack of exact time synchronization with lidar. To overcome these challenges, we propose an efficient spatio-temporal radar feature extraction scheme which achieves state-of-the-art performance on multiple large-scale datasets. Further, by incorporating radar information, we show a 52% reduction in prediction error for objects with high acceleration and a 16% reduction in prediction error for objects at longer range.
Abstract:We present an end-to-end method for object detection and trajectory prediction utilizing multi-view representations of LiDAR returns. Our method builds on a state-of-the-art Bird's-Eye View (BEV) network that fuses voxelized features from a sequence of historical LiDAR data as well as a rasterized high-definition map to perform detection and prediction tasks. We extend the BEV network with additional LiDAR Range-View (RV) features that use the raw LiDAR information in its native, non-quantized representation. The RV feature map is projected into BEV and fused with the BEV features computed from LiDAR and the high-definition map. The fused features are then further processed to output the final detections and trajectories within a single end-to-end trainable network. In addition, this framework allows the RV fusion of LiDAR and camera data to be performed in a straightforward and computationally efficient manner. The proposed approach improves the state-of-the-art on proprietary large-scale real-world data collected by a fleet of self-driving vehicles, as well as on the public nuScenes data set.
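A hedged sketch of projecting per-point range-view features into a BEV grid prior to fusion is shown below; the grid size, resolution, and mean pooling are assumptions made for illustration and not the paper's exact projection scheme.

import torch


def rv_to_bev(points_xy, rv_feats, bev_size=128, resolution=0.5):
    # points_xy: (N, 2) LiDAR point coordinates in meters (ego frame).
    # rv_feats: (N, C) per-point features produced by the range-view branch.
    half = bev_size * resolution / 2.0
    cells = ((points_xy + half) / resolution).long().clamp(0, bev_size - 1)
    flat = cells[:, 0] * bev_size + cells[:, 1]
    num_channels = rv_feats.shape[1]
    # Mean-pool all point features falling into the same BEV cell.
    grid = torch.zeros(bev_size * bev_size, num_channels).index_add_(0, flat, rv_feats)
    counts = torch.zeros(bev_size * bev_size).index_add_(0, flat, torch.ones(len(flat)))
    grid = grid / counts.clamp(min=1.0).unsqueeze(1)
    return grid.t().reshape(num_channels, bev_size, bev_size)


# Example: scatter 1000 points with 32-dim RV features into a 128x128 BEV grid,
# then concatenate with an existing BEV feature map along the channel dimension.
points_xy, rv_feats = torch.randn(1000, 2) * 20.0, torch.randn(1000, 32)
bev_feats = torch.randn(64, 128, 128)
fused = torch.cat([bev_feats, rv_to_bev(points_xy, rv_feats)], dim=0)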
Abstract:One of the critical pieces of the self-driving puzzle is understanding the surroundings of the self-driving vehicle (SDV) and predicting how these surroundings will change in the near future. To address this task we propose MultiXNet, an end-to-end approach for detection and motion prediction based directly on lidar sensor data. This approach builds on prior work by handling multiple classes of traffic actors, adding a jointly trained second-stage trajectory refinement step, and producing a multimodal probability distribution over future actor motion that includes both multiple discrete traffic behaviors and calibrated continuous uncertainties. The method was evaluated on a large-scale, real-world data set collected by a fleet of SDVs in several cities, with the results indicating that it outperforms existing state-of-the-art approaches.
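A schematic sketch of a multimodal trajectory output head with discrete modes and continuous uncertainties is shown below; the layer sizes, number of modes, and Gaussian-style uncertainty parameterization are illustrative assumptions rather than MultiXNet's exact output formulation.

import torch
import torch.nn as nn


class MultimodalTrajectoryHead(nn.Module):
    """For each actor, predict M candidate trajectories, per-waypoint scale
    parameters, and a probability per mode (sketch)."""

    def __init__(self, feat_dim=256, num_modes=3, horizon=30):
        super().__init__()
        self.num_modes, self.horizon = num_modes, horizon
        # Per mode and waypoint: (x, y) mean and (log sigma_x, log sigma_y).
        self.traj = nn.Linear(feat_dim, num_modes * horizon * 4)
        self.mode_logits = nn.Linear(feat_dim, num_modes)

    def forward(self, actor_feats):
        n = actor_feats.shape[0]
        out = self.traj(actor_feats).view(n, self.num_modes, self.horizon, 4)
        mean, log_sigma = out[..., :2], out[..., 2:]
        mode_probs = self.mode_logits(actor_feats).softmax(dim=-1)
        return mean, log_sigma.exp(), mode_probs


# Usage: per-actor features in, multimodal trajectory distribution out.
mean, sigma, probs = MultimodalTrajectoryHead()(torch.randn(5, 256))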
Abstract:Autonomous vehicles rely on robust real-time detection and future motion prediction of traffic participants to safely navigate urban environments. We present a novel end-to-end approach that uses raw time-series LiDAR data to jointly solve both detection and prediction. We use the range view representation of LiDAR instead of voxelization since it does not discard information and is more efficient due to its compactness. However, for time-series fusion, the data needs to be projected to a common viewpoint, and often this viewpoint is different from the one where it was captured, leading to distortions. These distortions have an adverse impact on performance. Thus, we propose a novel architecture which reduces the impact of distortions by sequentially projecting each sweep into the viewpoint of the next sweep in time. We demonstrate that our sequential fusion approach is superior to methods that directly project all the data into the most recent viewpoint. Furthermore, we compare our approach to existing state-of-the-art methods on multiple autonomous driving datasets and show competitive results.
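Below is a simplified sketch of the sequential projection idea, where accumulated points are re-projected one step at a time into the next sweep's viewpoint instead of being warped directly into the most recent frame; the 4x4 world-from-ego pose convention and concatenation-based fusion are assumptions made for illustration.

import numpy as np


def transform(points, src_pose, dst_pose):
    """Map (N, 3) points from the src ego frame into the dst ego frame."""
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    return (np.linalg.inv(dst_pose) @ src_pose @ homo.T).T[:, :3]


def sequential_fusion(sweeps, poses):
    """sweeps: list of (N_i, 3) point clouds, oldest first; poses: matching 4x4 poses.
    Returns all points expressed in the viewpoint of the most recent sweep."""
    carried = sweeps[0]
    for k in range(1, len(sweeps)):
        # Re-project what has been accumulated so far into the next viewpoint,
        # then fuse (here simply concatenate) with the next sweep.
        carried = transform(carried, poses[k - 1], poses[k])
        carried = np.concatenate([carried, sweeps[k]], axis=0)
    return carried


# Example: three sweeps while the ego vehicle moves 2 m forward per sweep.
poses = [np.eye(4) for _ in range(3)]
poses[1][0, 3], poses[2][0, 3] = 2.0, 4.0
sweeps = [np.random.rand(5, 3) for _ in range(3)]
fused_points = sequential_fusion(sweeps, poses)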
Abstract:In this work, we present LaserFlow, an efficient method for 3D object detection and motion forecasting from LiDAR. Unlike previous work, our approach utilizes the native range view representation of the LiDAR, which enables our method to operate at the full range of the sensor in real-time without voxelization or compression of the data. We propose a new multi-sweep fusion architecture, which extracts and merges temporal features directly from the range images. Furthermore, we propose a novel technique for learning a probability distribution over future trajectories inspired by curriculum learning. We evaluate LaserFlow on two autonomous driving datasets and demonstrate competitive results when compared to the existing state-of-the-art methods.