Earth's physical properties, such as atmosphere, topography and ground instability, can be determined by differencing billions of phase measurements (pixels) in successive matching Interferometric Synthetic Aperture Radar (InSAR) images. The quality (coherence) of each pixel can vary from perfect information (1) to complete noise (0); this quality needs to be quantified, and the information-bearing pixels need to be filtered. Phase filtering is thus critical to InSAR's Digital Elevation Model (DEM) production pipeline, as it removes spatial inconsistencies (residues) and thereby greatly improves the subsequent phase unwrapping. The recent explosion in the quantity of available InSAR data can facilitate Wide Area Monitoring (WAM) over several geographical regions, provided that effective and efficient automated processing can obviate manual quality control. Advances in parallel computing architectures, and in the Convolutional Neural Networks (CNNs) that thrive on them and rival human performance in visual pattern recognition, make a CNN-based approach well suited to InSAR phase filtering for WAM, yet it remains largely unexplored. We propose "GenInSAR", a CNN-based generative model for joint phase filtering and coherence estimation. Using satellite and simulated InSAR images, we show the overall superior performance of GenInSAR over five other algorithms, both qualitatively and quantitatively, using Phase and Coherence Root-Mean-Squared-Error, Residue Reduction Percentage, and Phase Cosine Error.
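To make the notions of residues and residue reduction concrete, the following is a minimal sketch in plain Python. It is not the GenInSAR implementation: the function names (`wrap`, `count_residues`, `residue_reduction_percentage`, `phase_rmse`) are hypothetical, and the metric definitions are assumptions. A residue is taken here in the usual sense, a 2x2 pixel loop whose wrapped phase differences do not sum to zero. Residue Reduction Percentage is assumed to be the relative drop in residue count after filtering, and phase RMSE is assumed to be computed on wrapped phase differences.

```python
import math


def wrap(phi):
    """Wrap a phase value in radians into the interval [-pi, pi)."""
    return (phi + math.pi) % (2 * math.pi) - math.pi


def count_residues(phase):
    """Count 2x2 pixel loops whose wrapped phase differences do not sum to zero.

    `phase` is a list of rows of wrapped phase values in radians.
    """
    rows, cols = len(phase), len(phase[0])
    count = 0
    for i in range(rows - 1):
        for j in range(cols - 1):
            # Walk the closed loop around the 2x2 cell, back to the start.
            loop = [phase[i][j], phase[i][j + 1],
                    phase[i + 1][j + 1], phase[i + 1][j], phase[i][j]]
            total = sum(wrap(b - a) for a, b in zip(loop, loop[1:]))
            # The loop sum is approximately 0 (consistent) or +/-2*pi (residue).
            if abs(total) > math.pi:
                count += 1
    return count


def residue_reduction_percentage(noisy, filtered):
    """Assumed definition: percentage drop in residue count after filtering."""
    before = count_residues(noisy)
    after = count_residues(filtered)
    if before == 0:
        return 0.0  # nothing to reduce
    return 100.0 * (before - after) / before


def phase_rmse(estimate, reference):
    """Assumed definition: RMSE of the wrapped phase difference, in radians."""
    diffs = [wrap(e - r)
             for est_row, ref_row in zip(estimate, reference)
             for e, r in zip(est_row, ref_row)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# A single 2x2 cell whose phase winds through a full cycle holds one residue.
noisy = [[0.0, math.pi / 2],
         [-math.pi / 2, math.pi]]
# A spatially consistent (here constant) cell holds none.
filtered = [[0.0, 0.0],
            [0.0, 0.0]]
```

With these toy inputs, `count_residues(noisy)` finds one residue and `count_residues(filtered)` finds none, so the assumed reduction percentage is 100. Wrapping the difference in `phase_rmse` ensures that two phases differing by a multiple of 2π are treated as equal.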