Abstract:Background: The mapping of tree species within Norwegian forests is a time-consuming process, involving forest associations relying on manual labeling by experts. The process can involve both aerial imagery, personal familiarity, or on-scene references, and remote sensing data. The state-of-the-art methods usually use high resolution aerial imagery with semantic segmentation methods. Methods: We present a deep learning based tree species classification model utilizing only lidar (Light Detection And Ranging) data. The lidar images are segmented into four classes (Norway Spruce, Scots Pine, Birch, background) with a U-Net based network. The model is trained with focal loss over partial weak labels. A major benefit of the approach is that both the lidar imagery and the base map for the labels have free and open access. Results: Our tree species classification model achieves a macro-averaged F1 score of 0.70 on an independent validation with National Forest Inventory (NFI) in-situ sample plots. That is close to, but below the performance of aerial, or aerial and lidar combined models.
Abstract:Estimating building footprint maps from geospatial data is of paramount importance in urban planning, development, disaster management, and various other applications. Deep learning methodologies have gained prominence in building segmentation maps, offering the promise of precise footprint extraction without extensive post-processing. However, these methods face challenges in generalization and label efficiency, particularly in remote sensing, where obtaining accurate labels can be both expensive and time-consuming. To address these challenges, we propose terrain-aware self-supervised learning, tailored to remote sensing, using digital elevation models from LiDAR data. We propose to learn a model to differentiate between bare Earth and superimposed structures enabling the network to implicitly learn domain-relevant features without the need for extensive pixel-level annotations. We test the effectiveness of our approach by evaluating building segmentation performance on test datasets with varying label fractions. Remarkably, with only 1% of the labels (equivalent to 25 labeled examples), our method improves over ImageNet pre-training, showing the advantage of leveraging unlabeled data for feature extraction in the domain of remote sensing. The performance improvement is more pronounced in few-shot scenarios and gradually closes the gap with ImageNet pre-training as the label fraction increases. We test on a dataset characterized by substantial distribution shifts and labeling errors to demonstrate the generalizability of our approach. When compared to other baselines, including ImageNet pretraining and more complex architectures, our approach consistently performs better, demonstrating the efficiency and effectiveness of self-supervised terrain-aware feature learning.