Abstract:Floods are among the most common and devastating natural hazards, imposing immense costs on our society and economy due to their disastrous consequences. Recent progress in weather prediction and spaceborne flood mapping demonstrated the feasibility of anticipating extreme events and reliably detecting their catastrophic effects afterwards. However, these efforts are rarely linked to one another and there is a critical lack of datasets and benchmarks to enable the direct forecasting of flood extent. To resolve this issue, we curate a novel dataset enabling a timely prediction of flood extent. Furthermore, we provide a representative evaluation of state-of-the-art methods, structured into two benchmark tracks for forecasting flood inundation maps i) in general and ii) focused on coastal regions. Altogether, our dataset and benchmark provide a comprehensive platform for evaluating flood forecasts, enabling future solutions for this critical challenge. Data, code & models are shared at https://github.com/Multihuntr/GFF under a CC0 license.
Abstract:Hurricanes and coastal floods are among the most disastrous natural hazards. Both are intimately related to storm surges, as their causes and effects, respectively. However, the short-term forecasting of storm surges has proven challenging, especially when targeting previously unseen locations or sites without tidal gauges. Furthermore, recent work improved short and medium-term weather forecasting but the handling of raw unassimilated data remains non-trivial. In this paper, we tackle both challenges and demonstrate that neural networks can implicitly assimilate sparse in situ tide gauge data with coarse ocean state reanalysis in order to forecast storm surges. We curate a global dataset to learn and validate the dense prediction of storm surges, building on preceding efforts. Other than prior work limited to known gauges, our approach extends to ungauged sites, paving the way for global storm surge forecasting.
Abstract:Clouds and haze often occlude optical satellite images, hindering continuous, dense monitoring of the Earth's surface. Although modern deep learning methods can implicitly learn to ignore such occlusions, explicit cloud removal as pre-processing enables manual interpretation and allows training models when only few annotations are available. Cloud removal is challenging due to the wide range of occlusion scenarios -- from scenes partially visible through haze, to completely opaque cloud coverage. Furthermore, integrating reconstructed images in downstream applications would greatly benefit from trustworthy quality assessment. In this paper, we introduce UnCRtainTS, a method for multi-temporal cloud removal combining a novel attention-based architecture, and a formulation for multivariate uncertainty prediction. These two components combined set a new state-of-the-art performance in terms of image reconstruction on two public cloud removal datasets. Additionally, we show how the well-calibrated predicted uncertainties enable a precise control of the reconstruction quality.
Abstract:In this paper, we introduce Planet-CR, a benchmark dataset for high-resolution cloud removal with multi-modal and multi-resolution data fusion. Planet-CR is the first public dataset for cloud removal to feature globally sampled high resolution optical observations, in combination with paired radar measurements as well as pixel-level land cover annotations. It provides solid basis for exhaustive evaluation in terms of generating visually pleasing textures and semantically meaningful structures. With this dataset, we consider the problem of cloud removal in high resolution optical remote sensing imagery by integrating multi-modal and multi-resolution information. Existing multi-modal data fusion based methods, which assume the image pairs are aligned pixel-to-pixel, are hence not appropriate for this problem. To this end, we design a new baseline named Align-CR to perform the low-resolution SAR image guided high-resolution optical image cloud removal. It implicitly aligns the multi-modal and multi-resolution data during the reconstruction process to promote the cloud removal performance. The experimental results demonstrate that the proposed Align-CR method gives the best performance in both visual recovery quality and semantic recovery quality. The project is available at https://github.com/zhu-xlab/Planet-CR, and hope this will inspire future research.
Abstract:With modern infotainment systems, drivers are increasingly tempted to engage in secondary tasks while driving. Since distracted driving is already one of the main causes of fatal accidents, in-vehicle touchscreen Human-Machine Interfaces (HMIs) must be as little distracting as possible. To ensure that these systems are safe to use, they undergo elaborate and expensive empirical testing, requiring fully functional prototypes. Thus, early-stage methods informing designers about the implication their design may have on driver distraction are of great value. This paper presents a machine learning method that, based on anticipated usage scenarios, predicts the visual demand of in-vehicle touchscreen interactions and provides local and global explanations of the factors influencing drivers' visual attention allocation. The approach is based on large-scale natural driving data continuously collected from production line vehicles and employs the SHapley Additive exPlanation (SHAP) method to provide explanations leveraging informed design decisions. Our approach is more accurate than related work and identifies interactions during which long glances occur with 68 % accuracy and predicts the total glance duration with a mean error of 2.4 s. Our explanations replicate the results of various recent studies and provide fast and easily accessible insights into the effect of UI elements, driving automation, and vehicle speed on driver distraction. The system can not only help designers to evaluate current designs but also help them to better anticipate and understand the implications their design decisions might have on future designs.
Abstract:The challenge of the cloud removal task can be alleviated with the aid of Synthetic Aperture Radar (SAR) images that can penetrate cloud cover. However, the large domain gap between optical and SAR images as well as the severe speckle noise of SAR images may cause significant interference in SAR-based cloud removal, resulting in performance degeneration. In this paper, we propose a novel global-local fusion based cloud removal (GLF-CR) algorithm to leverage the complementary information embedded in SAR images. Exploiting the power of SAR information to promote cloud removal entails two aspects. The first, global fusion, guides the relationship among all local optical windows to maintain the structure of the recovered region consistent with the remaining cloud-free regions. The second, local fusion, transfers complementary information embedded in the SAR image that corresponds to cloudy areas to generate reliable texture details of the missing regions, and uses dynamic filtering to alleviate the performance degradation caused by speckle noise. Extensive evaluation demonstrates that the proposed algorithm can yield high quality cloud-free images and performs favorably against state-of-the-art cloud removal algorithms.
Abstract:About half of all optical observations collected via spaceborne satellites are affected by haze or clouds. Consequently, cloud coverage affects the remote sensing practitioner's capabilities of a continuous and seamless monitoring of our planet. This work addresses the challenge of optical satellite image reconstruction and cloud removal by proposing a novel multi-modal and multi-temporal data set called SEN12MS-CR-TS. We propose two models highlighting the benefits and use cases of SEN12MS-CR-TS: First, a multi-modal multi-temporal 3D-Convolution Neural Network that predicts a cloud-free image from a sequence of cloudy optical and radar images. Second, a sequence-to-sequence translation model that predicts a cloud-free time series from a cloud-covered time series. Both approaches are evaluated experimentally, with their respective models trained and tested on SEN12MS-CR-TS. The conducted experiments highlight the contribution of our data set to the remote sensing community as well as the benefits of multi-modal and multi-temporal information to reconstruct noisy information. Our data set is available at https://patrickTUM.github.io/cloud_removal
Abstract:Multimodal and multisensor data analysis is a long-standing goal in machine learning research. In this paper we revisit multisensor analysis in context of self-supervised change detection in bi-temporal satellite images. Most change detection methods assume that pre-change and post-change images are acquired by the same sensor. However, in many real-life scenarios, e.g., natural disaster, it is more practical to use the latest available images before and after the occurrence of incidence, which may be acquired using different sensors. In particular, we are interested in the combination of the images acquired by optical and Synthetic Aperture Radar (SAR) sensors. While optical images are like the natural images dealt in computer vision, SAR images appear vastly different even when capturing the same scene. Adding to this, change detection methods are often constrained to use only target image-pair, no labeled data, and no additional unlabeled data. Such constraints limit the scope of traditional supervised machine learning and unsupervised generative approaches for multi-sensor change detection. Recent rapid development of self-supervised learning methods has shown that some of them can even work with only few images. Motivated by this, in this work we propose a method for multi-sensor change detection using only the unlabeled target bi-temporal images that are used for training a network in self-supervised fashion by using deep clustering and contrastive learning. The trained network is evaluated on multi-modal satellite data showing change and the benefits of our self-supervised approach are demonstrated.
Abstract:This work has been accepted by IEEE TGRS for publication. The majority of optical observations acquired via spaceborne earth imagery are affected by clouds. While there is numerous prior work on reconstructing cloud-covered information, previous studies are oftentimes confined to narrowly-defined regions of interest, raising the question of whether an approach can generalize to a diverse set of observations acquired at variable cloud coverage or in different regions and seasons. We target the challenge of generalization by curating a large novel data set for training new cloud removal approaches and evaluate on two recently proposed performance metrics of image quality and diversity. Our data set is the first publically available to contain a global sample of co-registered radar and optical observations, cloudy as well as cloud-free. Based on the observation that cloud coverage varies widely between clear skies and absolute coverage, we propose a novel model that can deal with either extremes and evaluate its performance on our proposed data set. Finally, we demonstrate the superiority of training models on real over synthetic data, underlining the need for a carefully curated data set of real observations. To facilitate future research, our data set is made available online
Abstract:Requirements elicitation has recently been complemented with crowd-based techniques, which continuously involve large, heterogeneous groups of users who express their feedback through a variety of media. Crowd-based elicitation has great potential for engaging with (potential) users early on but also results in large sets of raw and unstructured feedback. Consolidating and analyzing this feedback is a key challenge for turning it into sensible user requirements. In this paper, we focus on topic modeling as a means to identify topics within a large set of crowd-generated user stories and compare three approaches: (1) a traditional approach based on Latent Dirichlet Allocation, (2) a combination of word embeddings and principal component analysis, and (3) a combination of word embeddings and Word Mover's Distance. We evaluate the approaches on a publicly available set of 2,966 user stories written and categorized by crowd workers. We found that a combination of word embeddings and Word Mover's Distance is most promising. Depending on the word embeddings we use in our approaches, we manage to cluster the user stories in two ways: one that is closer to the original categorization and another that allows new insights into the dataset, e.g. to find potentially new categories. Unfortunately, no measure exists to rate the quality of our results objectively. Still, our findings provide a basis for future work towards analyzing crowd-sourced user stories.