Abstract:In highway scenarios, an alert human driver will typically anticipate early cut-in/cut-out maneuvers of surrounding vehicles using visual cues mainly. Autonomous vehicles must anticipate these situations at an early stage too, to increase their safety and efficiency. In this work, lane-change recognition and prediction tasks are posed as video action recognition problems. Up to four different two-stream-based approaches, that have been successfully applied to address human action recognition, are adapted here by stacking visual cues from forward-looking video cameras to recognize and anticipate lane-changes of target vehicles. We study the influence of context and observation horizons on performance, and different prediction horizons are analyzed. The different models are trained and evaluated using the PREVENTION dataset. The obtained results clearly demonstrate the potential of these methodologies to serve as robust predictors of future lane-changes of surrounding vehicles proving an accuracy higher than 90% in time horizons of between 1-2 seconds.
Abstract:In highway scenarios, an alert human driver will typically anticipate early cut-in and cut-out maneuvers of surrounding vehicles using only visual cues. An automated system must anticipate these situations at an early stage too, to increase the safety and the efficiency of its performance. To deal with lane-change recognition and prediction of surrounding vehicles, we pose the problem as an action recognition/prediction problem by stacking visual cues from video cameras. Two video action recognition approaches are analyzed: two-stream convolutional networks and spatiotemporal multiplier networks. Different sizes of the regions around the vehicles are analyzed, evaluating the importance of the interaction between vehicles and the context information in the performance. In addition, different prediction horizons are evaluated. The obtained results demonstrate the potential of these methodologies to serve as robust predictors of future lane-changes of surrounding vehicles in time horizons between 1 and 2 seconds.