Finding sustainable and novel ways to predict city-wide mobility behaviour is an increasingly pressing problem given rising urban complexity and growing populations. This paper addresses it with a traffic frame prediction approach that uses Convolutional LSTMs to build a Temporal Autoencoder with U-Net style skip-connections. The design combines recurrent and traditional computer vision techniques to capture spatio-temporal dependencies at different scales without losing the topological details of a given city. The use of Cyclical Learning Rates is also presented; it improves training efficiency by reaching lower loss in fewer epochs than standard approaches.
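
For readers unfamiliar with Cyclical Learning Rates, the idea can be illustrated with the commonly used triangular policy, in which the learning rate rises linearly from a lower bound to an upper bound and falls back again over a fixed number of iterations. The sketch below is illustrative only: the function name `triangular_clr`, the bounds `base_lr` and `max_lr`, and the half-cycle length `step_size` are assumptions chosen for the example, and the abstract does not state which policy or values were actually used.

```python
import math


def triangular_clr(iteration, base_lr, max_lr, step_size):
    """Learning rate at a given iteration under a triangular cyclical policy.

    All parameter names and values are hypothetical, chosen for illustration.
    step_size is the number of iterations in half a cycle (base -> max).
    """
    # Index of the current cycle, starting at 1.
    cycle = math.floor(1 + iteration / (2 * step_size))
    # Position within the cycle: 1 at the cycle ends, 0 at the peak.
    x = abs(iteration / step_size - 2 * cycle + 1)
    # Linear interpolation between the two bounds.
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


if __name__ == "__main__":
    # Print the schedule at a few points of one cycle (assumed values).
    for it in (0, 50, 100, 150, 200):
        print(it, triangular_clr(it, base_lr=1e-4, max_lr=1e-2, step_size=100))
```

In a training loop, the returned value would be assigned to the optimiser's learning rate before each update, so the rate oscillates between the two bounds rather than decaying monotonically.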