Abstract:We propose Mask CycleGAN, a novel architecture for unpaired image domain translation built based on CycleGAN, with an aim to address two issues: 1) unimodality in image translation and 2) lack of interpretability of latent variables. Our innovation in the technical approach is comprised of three key components: masking scheme, generator and objective. Experimental results demonstrate that this architecture is capable of bringing variations to generated images in a controllable manner and is reasonably robust to different masks.
Abstract:Predicting future trajectories of road agents is a critical task for autonomous driving. Recent goal-based trajectory prediction methods, such as DenseTNT and PECNet, have shown good performance on prediction tasks on public datasets. However, they usually require complicated goal-selection algorithms and optimization. In this work, we propose KEMP, a hierarchical end-to-end deep learning framework for trajectory prediction. At the core of our framework is keyframe-based trajectory prediction, where keyframes are representative states that trace out the general direction of the trajectory. KEMP first predicts keyframes conditioned on the road context, and then fills in intermediate states conditioned on the keyframes and the road context. Under our general framework, goal-conditioned methods are special cases in which the number of keyframes equal to one. Unlike goal-conditioned methods, our keyframe predictor is learned automatically and does not require hand-crafted goal-selection algorithms. We evaluate our model on public benchmarks and our model ranked 1st on Waymo Open Motion Dataset Leaderboard (as of September 1, 2021).