Land use as contained in geospatial databases constitutes an essential input for different applica-tions such as urban management, regional planning and environmental monitoring. In this paper, a hierarchical deep learning framework is proposed to verify the land use information. For this purpose, a two-step strategy is applied. First, given high-resolution aerial images, the land cover information is determined. To achieve this, an encoder-decoder based convolutional neural net-work (CNN) is proposed. Second, the pixel-wise land cover information along with the aerial images serves as input for another CNN to classify land use. Because the object catalogue of geospatial databases is frequently constructed in a hierarchical manner, we propose a new CNN-based method aiming to predict land use in multiple levels hierarchically and simultaneously. A so called Joint Optimization (JO) is proposed where predictions are made by selecting the hier-archical tuple over all levels which has the maximum joint class scores, providing consistent results across the different levels. The conducted experiments show that the CNN relying on JO outperforms previous results, achieving an overall accuracy up to 92.5%. In addition to the individual experiments on two test sites, we investigate whether data showing different characteristics can improve the results of land cover and land use classification, when processed together. To do so, we combine the two datasets and undertake some additional experiments. The results show that adding more data helps both land cover and land use classification, especially the identification of underrepre-sented categories, despite their different characteristics.