Arnt-Børre Salberg

DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models

Jun 13, 2025

Muhammad Sarmad, Arnt-Børre Salberg, Michael Kampffmeyer

Abstract: This paper presents DiffFuSR, a modular pipeline for super-resolving all 12 spectral bands of Sentinel-2 Level-2A imagery to a unified ground sampling distance (GSD) of 2.5 meters. The pipeline comprises two stages: (i) a diffusion-based super-resolution (SR) model trained on high-resolution RGB imagery from the NAIP and WorldStrat datasets, harmonized to simulate Sentinel-2 characteristics; and (ii) a learned fusion network that upscales the remaining multispectral bands using the super-resolved RGB image as a spatial prior. We introduce a robust degradation model and contrastive degradation encoder to support blind SR. Extensive evaluations of the proposed SR pipeline on the OpenSR benchmark demonstrate that the proposed method outperforms current SOTA baselines in terms of reflectance fidelity, spectral consistency, spatial alignment, and hallucination suppression. Furthermore, the fusion network significantly outperforms classical pansharpening approaches, enabling accurate enhancement of Sentinel-2's 20 m and 60 m bands. This study underscores the power of harmonized learning with generative priors and fusion strategies to create a modular framework for Sentinel-2 SR. Our code and models can be found at https://github.com/NorskRegnesentral/DiffFuSR.

* preprint under review
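
To make the "unified GSD of 2.5 meters" concrete, the following is a minimal sketch of the linear upscaling factor each band would need. The band-to-resolution table is an assumption taken from the standard Sentinel-2 band layout (10 m, 20 m and 60 m groups), not from the abstract, and the helper names are hypothetical.

```python
# Native ground sampling distance (metres) of the 12 Sentinel-2 Level-2A bands.
# Assumption: standard Sentinel-2 band layout; not stated in the abstract.
NATIVE_GSD_M = {
    "B02": 10, "B03": 10, "B04": 10, "B08": 10,
    "B05": 20, "B06": 20, "B07": 20, "B8A": 20, "B11": 20, "B12": 20,
    "B01": 60, "B09": 60,
}
TARGET_GSD_M = 2.5  # target GSD named in the abstract


def upscale_factor(native_gsd_m, target_gsd_m=TARGET_GSD_M):
    """Linear upscaling factor needed to go from the native to the target GSD."""
    return native_gsd_m / target_gsd_m


def output_shape(height, width, native_gsd_m, target_gsd_m=TARGET_GSD_M):
    """Pixel dimensions of a tile after resampling to the target GSD."""
    factor = upscale_factor(native_gsd_m, target_gsd_m)
    return (round(height * factor), round(width * factor))


# One factor per band: 4x for 10 m bands, 8x for 20 m bands, 24x for 60 m bands.
factors = {band: upscale_factor(gsd) for band, gsd in NATIVE_GSD_M.items()}
```

Under these assumptions the RGB bands need a 4x increase, while the 20 m and 60 m bands handled by the fusion stage need 8x and 24x respectively.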

Uncertainties of Satellite-based Essential Climate Variables from Deep Learning

Dec 23, 2024

Junyang Gou, Arnt-Børre Salberg, Mostafa Kiani Shahvandi, Mohammad J. Tourian, Ulrich Meyer, Eva Boergens, Anders U. Waldeland, Isabella Velicogna, Fredrik Dahl, Adrian Jäggi (+2 more)

Abstract: Accurate uncertainty information associated with essential climate variables (ECVs) is crucial for reliable climate modeling and understanding the spatiotemporal evolution of the Earth system. In recent years, geoscience and climate scientists have benefited from rapid progress in deep learning to advance the estimation of ECV products with improved accuracy. However, the quantification of uncertainties associated with the output of such deep learning models has yet to be thoroughly adopted. This survey explores the types of uncertainties associated with ECVs estimated from deep learning and the techniques to quantify them. The focus is on highlighting the importance of quantifying uncertainties inherent in ECV estimates, considering the dynamic and multifaceted nature of climate data. The survey starts by clarifying the definition of aleatoric and epistemic uncertainties and their roles in a typical satellite observation processing workflow, followed by bridging the gap between conventional statistical and deep learning views on uncertainties. Then, we comprehensively review the existing techniques for quantifying uncertainties associated with deep learning algorithms, focusing on their application in ECV studies. The specific need for modification to fit the requirements from both the Earth observation side and the deep learning side in such interdisciplinary tasks is discussed. Finally, we demonstrate our findings with two ECV examples, snow cover and terrestrial water storage, and provide our perspectives for future research.
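
As an illustration of the aleatoric/epistemic distinction mentioned above, here is a minimal sketch of one common decomposition, based on an ensemble of models that each predict a mean and a variance for the same pixel. This is an assumed setup chosen for illustration (the law of total variance applied to an ensemble); the abstract does not say which techniques the survey favours, and the function name is hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split predictive variance for one pixel into two parts.

    member_means:     predicted mean from each ensemble member
    member_variances: predicted (noise) variance from each ensemble member

    Returns (aleatoric, epistemic, total), where
      aleatoric = average of the predicted variances (data noise),
      epistemic = spread of the predicted means (model disagreement),
      total     = aleatoric + epistemic (law of total variance).
    """
    aleatoric = fmean(member_variances)
    epistemic = pvariance(member_means)
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical three-member ensemble predicting a value for a single pixel.
aleatoric, epistemic, total = decompose_uncertainty(
    member_means=[1.0, 2.0, 3.0],
    member_variances=[0.5, 0.5, 0.5],
)
```

In this toy case the members agree on the noise level but disagree on the mean, so the epistemic part is larger than the aleatoric part.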

Multi-modal land cover mapping of remote sensing images using pyramid attention and gated fusion networks

Nov 06, 2021

Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

Abstract: Multi-modality data is becoming readily available in remote sensing (RS) and can provide complementary information about the Earth's surface. Effective fusion of multi-modal information is thus important for various applications in RS, but also very challenging due to large domain differences, noise, and redundancies. There is a lack of effective and scalable fusion techniques for bridging multiple modality encoders and fully exploiting complementary information. To this end, we propose a new multi-modality network (MultiModNet) for land cover mapping of multi-modal remote sensing data based on a novel pyramid attention fusion (PAF) module and a gated fusion unit (GFU). The PAF module is designed to efficiently obtain rich fine-grained contextual representations from each modality with a built-in cross-level and cross-view attention fusion mechanism, and the GFU module utilizes a novel gating mechanism for early merging of features, thereby diminishing hidden redundancies and noise. This enables supplementary modalities to effectively extract the most valuable and complementary information for late feature fusion. Extensive experiments on two representative RS benchmark datasets demonstrate the effectiveness, robustness, and superiority of the MultiModNet for multi-modal land cover classification.

* 24 pages, 11 figures, submitted to IJRS

SCG-Net: Self-Constructing Graph Neural Networks for Semantic Segmentation

Sep 03, 2020

Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

Abstract: Capturing global contextual representations by exploiting long-range pixel-pixel dependencies has been shown to improve semantic segmentation performance. However, how to do this efficiently is an open question, as current approaches that use attention schemes or very deep models to increase the model's field of view result in complex models with large memory consumption. Inspired by recent work on graph neural networks, we propose the Self-Constructing Graph (SCG) module that learns a long-range dependency graph directly from the image and uses it to propagate contextual information efficiently to improve semantic segmentation. The module is optimised via a novel adaptive diagonal enhancement method and a variational lower bound that consists of a customized graph reconstruction term and a Kullback-Leibler divergence regularization term. When incorporated into a neural network (SCG-Net), semantic segmentation is performed in an end-to-end manner and competitive performance (mean F1-scores of 92.0% and 89.8% respectively) on the publicly available ISPRS Potsdam and Vaihingen datasets is achieved, with far fewer parameters and at a lower computational cost compared to related pure convolutional neural network (CNN) based models.

* 11 pages, 5 figs. Draft version to TGRS, code will be open soon
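
The Kullback-Leibler regularization term mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual variational setup of a diagonal Gaussian latent with a standard normal prior, which has a closed-form KL divergence; the abstract does not give the exact formulation used in SCG-Net, so this should be read as a generic example rather than the authors' loss. The function name is hypothetical.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one latent vector.

    Closed form: -0.5 * sum(1 + log_var - mu^2 - exp(log_var)).
    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


# A posterior equal to the prior gives zero divergence.
kl_same = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# Shifting the mean of a unit-variance latent by 1 gives 0.5 nats.
kl_shifted = kl_to_standard_normal([1.0], [0.0])
```

In a variational lower bound, a term of this kind is added to the reconstruction term so that the learned latent variables stay close to the prior.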

Self-Constructing Graph Convolutional Networks for Semantic Labeling

Apr 23, 2020

Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

Abstract: Graph Neural Networks (GNNs) have received increasing attention in many fields. However, due to the lack of prior graphs, their use for semantic labeling has been limited. Here, we propose a novel architecture called the Self-Constructing Graph (SCG), which makes use of learnable latent variables to generate embeddings and to self-construct the underlying graphs directly from the input features without relying on manually built prior knowledge graphs. SCG can automatically obtain optimized non-local context graphs from complex-shaped objects in aerial imagery. We optimize SCG via an adaptive diagonal enhancement method and a variational lower bound that consists of a customized graph reconstruction term and a Kullback-Leibler divergence regularization term. We demonstrate the effectiveness and flexibility of the proposed SCG on the publicly available ISPRS Vaihingen dataset, and our model SCG-Net achieves competitive results in terms of F1-score with far fewer parameters and at a lower computational cost compared to related pure-CNN based work. Our code will be made public soon.

* IGARSS-2020, code at: github.com/samleoqh/MSCG-Net

Multi-view Self-Constructing Graph Convolutional Networks with Adaptive Class Weighting Loss for Semantic Segmentation

Apr 21, 2020

Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

Abstract: We propose a novel architecture called the Multi-view Self-Constructing Graph Convolutional Networks (MSCG-Net) for semantic segmentation. Building on the recently proposed Self-Constructing Graph (SCG) module, which makes use of learnable latent variables to self-construct the underlying graphs directly from the input features without relying on manually built prior knowledge graphs, we leverage multiple views in order to explicitly exploit the rotational invariance in airborne images. We further develop an adaptive class weighting loss to address the class imbalance. We demonstrate the effectiveness and flexibility of the proposed method on the Agriculture-Vision challenge dataset, and our model achieves very competitive results (0.547 mIoU) with far fewer parameters and at a lower computational cost compared to related pure-CNN based work. Code will be available at: github.com/samleoqh/MSCG-Net

* 7-page, MSCG-Net, CVPRW-2020

Dense Dilated Convolutions Merging Network for Land Cover Classification

Mar 09, 2020

Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

Abstract: Land cover classification of remote sensing images is a challenging task due to limited amounts of annotated data, highly imbalanced classes, frequent incorrect pixel-level annotations, and an inherent complexity in the semantic segmentation task. In this article, we propose a novel architecture called the dense dilated convolutions' merging network (DDCM-Net) to address this task. The proposed DDCM-Net consists of dense dilated image convolutions merged with varying dilation rates. This effectively utilizes rich combinations of dilated convolutions that enlarge the network's receptive fields with fewer parameters and features compared with the state-of-the-art approaches in the remote sensing domain. Importantly, DDCM-Net obtains fused local- and global-context information, in effect incorporating surrounding discriminative capability for multiscale and complex-shaped objects with similar color and textures in very high-resolution aerial imagery. We demonstrate the effectiveness, robustness, and flexibility of the proposed DDCM-Net on the publicly available ISPRS Potsdam and Vaihingen data sets, as well as the DeepGlobe land cover data set. Our single model, trained on three-band Potsdam and Vaihingen data sets, achieves better accuracy in terms of both mean intersection over union (mIoU) and F1-score compared with other published models trained with more than three-band data. We further validate our model on the DeepGlobe data set, achieving a state-of-the-art result of 56.2% mIoU with far fewer parameters and at a lower computational cost compared with related recent work. Code available at https://github.com/samleoqh/DDCM-Semantic-Segmentation-PyTorch

* Semantic Segmentation, 12 pages, TGRS-2020 early access in IEEE Transactions on Geoscience and Remote Sensing. 2020, Code available at https://github.com/samleoqh/DDCM-Semantic-Segmentation-PyTorch
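
Since the abstract reports both mIoU and F1-score, the following minimal sketch shows how the two metrics are typically derived from a confusion matrix. The matrix values are made up for illustration, and the benchmark-specific evaluation rules (for example, ignored classes or boundary handling) are not covered here.

```python
def per_class_iou_f1(conf):
    """Per-class IoU and F1 from a square confusion matrix.

    conf[t][p] is the number of pixels with true class t predicted as class p.
    """
    n = len(conf)
    ious, f1s = [], []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # true c, predicted other
        fp = sum(conf[r][c] for r in range(n)) - tp  # predicted c, true other
        iou_den = tp + fp + fn
        f1_den = 2 * tp + fp + fn
        ious.append(tp / iou_den if iou_den else 0.0)
        f1s.append(2 * tp / f1_den if f1_den else 0.0)
    return ious, f1s


def mean(values):
    """Arithmetic mean over classes."""
    return sum(values) / len(values)


# Hypothetical two-class confusion matrix (rows: truth, columns: prediction).
confusion = [[3, 1],
             [1, 5]]
ious, f1s = per_class_iou_f1(confusion)
miou, mean_f1 = mean(ious), mean(f1s)
```

For any class, F1 is at least as large as IoU, which is why the F1 figures quoted for a model are higher than its mIoU on the same data.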

Road Mapping In LiDAR Images Using A Joint-Task Dense Dilated Convolutions Merging Network

Sep 07, 2019

Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

Abstract: It is important, but challenging, for the forest industry to accurately map roads which are used for timber transport by trucks. In this work, we propose a Dense Dilated Convolutions Merging Network (DDCM-Net) to detect these roads in lidar images. The DDCM-Net can effectively recognize multi-scale and complex-shaped roads with similar texture and colors, and is shown to have superior performance over existing methods. To further improve its ability to accurately infer categories of roads, we propose the use of a joint-task learning strategy that utilizes two auxiliary output branches, i.e., multi-class classification and binary segmentation, joined with the main output of full-class segmentation. This pushes the network towards learning more robust representations that are expected to boost the ultimate performance of the main task. In addition, we introduce an iterative-random-weighting method to automatically weigh the joint losses for auxiliary tasks. This can avoid the difficult and expensive process of tuning the weights of each task's loss by hand. The experiments demonstrate that our proposed joint-task DDCM-Net can achieve better performance with fewer parameters and higher computational efficiency than previous state-of-the-art approaches.

* IGARSS 2019. arXiv admin note: text overlap with arXiv:1908.11799

Dense Dilated Convolutions Merging Network for Semantic Mapping of Remote Sensing Images

Aug 30, 2019

Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

Abstract: We propose a network for semantic mapping called the Dense Dilated Convolutions Merging Network (DDCM-Net) to provide a deep learning approach that can recognize multi-scale and complex-shaped objects with similar color and textures, such as buildings, surfaces/roads, and trees in very high-resolution remote sensing images. The proposed DDCM-Net consists of dense dilated convolutions merged with varying dilation rates. This can effectively enlarge the kernels' receptive fields, and, more importantly, obtain fused local and global context information to promote surrounding discriminative capability. We demonstrate the effectiveness of the proposed DDCM-Net on the publicly available ISPRS Potsdam dataset and achieve a performance of 92.3% F1-score and 86.0% mean intersection over union accuracy by only using the RGB bands, without any post-processing. We also show results on the ISPRS Vaihingen dataset, where the DDCM-Net, trained with IRRG bands, also obtained better mapping accuracy (89.8% F1-score) than previous state-of-the-art approaches.

* JURSE 2019
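
The claim that varying dilation rates enlarge the receptive field can be checked with a short calculation. The sketch below assumes a plain sequential stack of stride-1 convolutions, which is a simplification of the dense merging structure described in the abstract, and the dilation rates are hypothetical values chosen only for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in pixels, along one axis) of a stack of stride-1
    convolutions with the given kernel size and per-layer dilation rates.

    Each layer adds (kernel_size - 1) * dilation pixels to the field.
    """
    rf = 1
    for dilation in dilations:
        rf += (kernel_size - 1) * dilation
    return rf


# Three ordinary 3x3 layers versus three dilated 3x3 layers.
# Both stacks have the same number of weights per layer.
plain_rf = receptive_field(3, [1, 1, 1])    # 7 pixels
dilated_rf = receptive_field(3, [1, 2, 4])  # 15 pixels
```

With the same kernel size and layer count, the dilated stack sees roughly twice as far along each axis, which is the parameter-efficiency argument made for dilated convolutions.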

Deep Divergence-Based Approach to Clustering

Feb 13, 2019

Michael Kampffmeyer, Sigurd Løkse, Filippo M. Bianchi, Lorenzo Livi, Arnt-Børre Salberg, Robert Jenssen

Abstract: A promising direction in deep learning research consists in learning representations and simultaneously discovering cluster structure in unlabeled data by optimizing a discriminative loss function. As opposed to supervised deep learning, this line of research is in its infancy, and how to design and optimize suitable loss functions to train deep neural networks for clustering is still an open question. Our contribution to this emerging field is a new deep clustering network that leverages the discriminative power of information-theoretic divergence measures, which have been shown to be effective in traditional clustering. We propose a novel loss function that incorporates geometric regularization constraints, thus avoiding degenerate structures of the resulting clustering partition. Experiments on synthetic benchmarks and real datasets show that the proposed network achieves competitive performance with respect to other state-of-the-art methods, scales well to large datasets, and does not require pre-training steps.
