Abstract: Estimating the construction year of buildings is of great importance for sustainability. Sustainable buildings minimize energy consumption and are a key part of responsible urban planning and development to effectively combat climate change. Using Artificial Intelligence (AI) and recently proposed Transformer models, we are able to estimate the construction epoch of buildings from a multi-modal dataset. In this paper, we introduce a new benchmark multi-modal dataset, the Map your City Dataset (MyCD), containing top-view Very High Resolution (VHR) images, Earth Observation (EO) multi-spectral data from the Copernicus Sentinel-2 satellite constellation, and street-view images from many different cities in Europe, co-localized with respect to the building under study and labelled with the construction epoch. We assess EO generalization performance on previously unseen cities that have been held out from training and appear only during inference. We also present the community-based data challenge we organized based on MyCD, the ESA AI4EO Challenge MapYourCity, which opened in 2024 and ran for 4 months. We present the Top-4 performing models and the main evaluation results. During inference, we examine the performance of the models both with all three input modalities and with only the two top-view modalities, i.e. without the street-view images. The evaluation results show that the models are effective and can achieve good performance on this difficult real-world task of estimating the age of buildings, even on previously unseen cities and even when using only the two top-view modalities (i.e. VHR and Sentinel-2) during inference.
Abstract: Human settlements are the cause and consequence of most environmental and societal changes on Earth; however, their location and extent are still under debate. We provide here a new 10 m resolution (0.32 arc sec) global map of human settlements on Earth for the year 2015, namely the World Settlement Footprint 2015 (WSF2015). The raster dataset has been generated by means of an advanced classification system which, for the first time, jointly exploits open-and-free optical and radar satellite imagery. The WSF2015 has been validated against 900,000 samples labelled by crowdsourced photointerpretation of very high resolution Google Earth imagery and outperforms all other similar existing layers; in particular, it considerably improves the detection of very small settlements in rural regions and better outlines scattered suburban areas. The dataset can be used at any scale of observation in support of all applications requiring detailed and accurate information on human presence (e.g., socioeconomic development, population distribution, risk assessment).