Abstract:Pedestrian trajectory prediction is an active research area with recent works undertaken to embed accurate models of pedestrians social interactions and their contextual compliance into dynamic spatial graphs. However, existing works rely on spatial assumptions about the scene and dynamics, which entails a significant challenge to adapt the graph structure in unknown environments for an online system. In addition, there is a lack of assessment approach for the relational modeling impact on prediction performance. To fill this gap, we propose Social Trajectory Recommender-Gated Graph Recurrent Neighborhood Network, (STR-GGRNN), which uses data-driven adaptive online neighborhood recommendation based on the contextual scene features and pedestrian visual cues. The neighborhood recommendation is achieved by online Nonnegative Matrix Factorization (NMF) to construct the graph adjacency matrices for predicting the pedestrians' trajectories. Experiments based on widely-used datasets show that our method outperforms the state-of-the-art. Our best performing model achieves 12 cm ADE and $\sim$15 cm FDE on ETH-UCY dataset. The proposed method takes only 0.49 seconds when sampling a total of 20K future trajectories per frame.
Abstract:Pedestrian trajectory prediction is a prominent research track that has advanced towards modelling of crowd social and contextual interactions, with extensive usage of Long Short-Term Memory (LSTM) for temporal representation of walking trajectories. Existing approaches use virtual neighborhoods as a fixed grid for pooling social states of pedestrians with tuning process that controls how social interactions are being captured. This entails performance customization to specific scenes but lowers the generalization capability of the approaches. In our work, we deploy \textit{Grid-LSTM}, a recent extension of LSTM, which operates over multidimensional feature inputs. We present a new perspective to interaction modeling by proposing that pedestrian neighborhoods can become adaptive in design. We use \textit{Grid-LSTM} as an encoder to learn about potential future neighborhoods and their influence on pedestrian motion given the visual and the spatial boundaries. Our model outperforms state-of-the-art approaches that collate resembling features over several publicly-tested surveillance videos. The experiment results clearly illustrate the generalization of our approach across datasets that varies in scene features and crowd dynamics.
Abstract:Pedestrian trajectory prediction is essential for collision avoidance in autonomous driving and robot navigation. However, predicting a pedestrian's trajectory in crowded environments is non-trivial as it is influenced by other pedestrians' motion and static structures that are present in the scene. Such human-human and human-space interactions lead to non-linearities in the trajectories. In this paper, we present a new spatio-temporal graph based Long Short-Term Memory (LSTM) network for predicting pedestrian trajectory in crowded environments, which takes into account the interaction with static (physical objects) and dynamic (other pedestrians) elements in the scene. Our results are based on two widely-used datasets to demonstrate that the proposed method outperforms the state-of-the-art approaches in human trajectory prediction. In particular, our method leads to a reduction in Average Displacement Error (ADE) and Final Displacement Error (FDE) of up to 55% and 61% respectively over state-of-the-art approaches.