Spatio-temporal (ST) data for urban applications, such as taxi demand, traffic flow, regional rainfall is inherently stochastic and unpredictable. Recently, deep learning based ST prediction models are proposed to learn the ST characteristics of data. However, it is still very challenging (1) to adequately learn the complex and non-linear ST relationships; (2) to model the high variations in the ST data volumes as it is inherently dynamic, changing over time (i.e., irregular) and highly influenced by many external factors, such as adverse weather, accidents, traffic control, PoI, etc.; and (3) as there can be many complicated external factors that can affect the accuracy and it is impossible to list them explicitly. To handle the aforementioned issues, in this paper, we propose a novel deep generative adversarial network based model (named, D-GAN) for more accurate ST prediction by implicitly learning ST feature representations in an unsupervised manner. D-GAN adopts a GAN-based structure and jointly learns generation and variational inference of data. More specifically, D-GAN consists of two major parts: (1) a deep ST feature learning network to model the ST correlations and semantic variations, and underlying factors of variations and irregularity in the data through the implicit distribution modelling; (2) a fusion module to incorporate external factors for reaching a better inference. To the best our knowledge, no prior work studies ST prediction problem via deep implicit generative model and in an unsupervised manner. Extensive experiments performed on two real-world datasets show that D-GAN achieves more accurate results than traditional as well as deep learning based ST prediction methods.