Multi-baseline interferometric synthetic aperture radar (InSAR) techniques are effective approaches for retrieving the 3-D information of urban areas. In order to obtain a plausible reconstruction, it is necessary to use large-stack interferograms. Hence, these methods are commonly not appropriate for large-scale 3-D urban mapping using TanDEM-X data where only a few acquisitions are available in average for each city. This work proposes a new SAR tomographic processing framework to work with those extremely small stacks, which integrates the non-local filtering into SAR tomography inversion. The applicability of the algorithm is demonstrated using a TanDEM-X multi-baseline stack with 5 bistatic interferograms over the whole city of Munich, Germany. Systematic comparison of our result with airborne LiDAR data shows that the relative height accuracy of two third buildings is within two meters, which outperforms the TanDEM-X raw DEM. The promising performance of the proposed algorithm paved the first step towards high quality large-scale 3-D urban mapping.