Deep Learning has gained immense success in pushing today's artificial intelligence forward. To solve the challenge of limited labeled data in the supervised learning world, unsupervised learning has been proposed years ago while low accuracy hinters its realistic applications. Generative adversarial network (GAN) emerges as an unsupervised learning approach with promising accuracy and are under extensively study. However, the execution of GAN is extremely memory and computation intensive and results in ultra-low speed and high-power consumption. In this work, we proposed a holistic solution for fast and energy-efficient GAN computation through a memristor-based neuromorphic system. First, we exploited a hardware and software co-design approach to map the computation blocks in GAN efficiently. We also proposed an efficient data flow for optimal parallelism training and testing, depending on the computation correlations between different computing blocks. To compute the unique and complex loss of GAN, we developed a diff-block with optimized accuracy and performance. The experiment results on big data show that our design achieves 2.8x speedup and 6.1x energy-saving compared with the traditional GPU accelerator, as well as 5.5x speedup and 1.4x energy-saving compared with the previous FPGA-based accelerator.