We present a novel artificial cognitive mapping system built on generative deep neural networks (a VAE/GAN), which maps input images to latent vectors and internally generates temporal sequences. After training, the temporal distance between two images is reflected in the distance between their corresponding latent vectors. This indicates that the latent space is constructed to mirror the proximity structure of the data set, and it may provide a mechanism by which many aspects of cognition are represented spatially. The present study allows the network to internally generate temporal sequences analogous to hippocampal replay/preplay: the VAE alone produces only near-accurate replays of past experiences, whereas introducing the GAN aligns the latent vectors of temporally close images more tightly and lends the generated sequences some instability. This instability may be the origin of the novel sequences observed in the hippocampus.
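The central claim above can be made concrete with a small sketch of the evaluation: encode a temporal sequence of images into latent vectors, then compare latent distances for temporally near versus far pairs. Everything here is hypothetical for illustration: `encode` is a toy linear stand-in for a trained VAE encoder, and the data are synthetic, so no correlation is expected from this sketch itself.

```python
import numpy as np

rng = np.random.default_rng(0)


def encode(images, W):
    """Toy stand-in for a trained VAE encoder: flatten and apply a linear map.

    A real encoder would be a trained neural network; this only illustrates
    the shapes involved (images -> latent vectors).
    """
    return images.reshape(len(images), -1) @ W


# Hypothetical data: a temporal sequence of ten 8x8 "images".
seq = rng.normal(size=(10, 8, 8))
# Illustrative latent dimension of 4 (an assumption, not the paper's value).
W = rng.normal(size=(64, 4)) / 8.0

z = encode(seq, W)  # shape (10, 4): one latent vector per frame


def latent_dist(z, i, j):
    """Euclidean distance between the latent vectors of frames i and j.

    The paper's result is that, after training, temporal distance |i - j|
    is reflected in this latent distance.
    """
    return float(np.linalg.norm(z[i] - z[j]))


d_adjacent = latent_dist(z, 0, 1)  # temporally close pair
d_far = latent_dist(z, 0, 9)       # temporally distant pair
```

In the trained system described above, `d_adjacent` would be systematically smaller than `d_far`; plotting latent distance against temporal separation is one direct way to test the proximity-structure claim.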