Abstract: With climate change expected to exacerbate fire weather conditions, accurately anticipating wildfires on a global scale becomes increasingly crucial for disaster mitigation. In this study, we utilize SeasFire, a comprehensive global wildfire dataset with climate, vegetation, oceanic indices, and human-related variables, to enable seasonal wildfire forecasting with machine learning. For the predictive analysis, we train deep learning models with different architectures that capture the spatio-temporal context leading to wildfires. Our investigation focuses on assessing the effectiveness of these models in predicting the presence of burned areas at varying forecasting horizons globally, extending up to six months into the future, and on how different spatial and/or temporal contexts affect model performance. Our findings demonstrate the great potential of deep learning models for seasonal fire forecasting: longer input time series lead to more robust predictions across varying forecasting horizons, while integrating spatial information to capture wildfire spatio-temporal dynamics boosts performance. Finally, our results hint that, in order to enhance performance at longer forecasting horizons, a larger spatial receptive field needs to be considered.
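A minimal sketch of how the experimental design above can be framed as supervised learning over a (time, variable, lat, lon) datacube, with a configurable input history and forecasting horizon. All shapes, variable counts and the burned-area proxy are illustrative assumptions, not the SeasFire layout.

```python
import torch

# Hypothetical datacube: 120 monthly steps, 10 driver variables, 1-degree grid.
cube = torch.randn(120, 10, 180, 360)
history, horizon = 6, 3  # 6 months of input context, predict 3 months ahead

def make_sample(t: int):
    x = cube[t - history:t]                        # input time series of drivers
    y = (cube[t + horizon - 1, 0] > 1.0).float()   # stand-in burned-area presence mask
    return x, y

x, y = make_sample(t=12)
print(x.shape, y.shape)  # torch.Size([6, 10, 180, 360]) torch.Size([180, 360])
```

Varying `history` and `horizon` reproduces the two axes the study explores: temporal context length and forecasting lead time.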
Abstract: Deep Learning (DL) is undergoing a paradigm shift with the emergence of foundation models, aptly named for their crucial, yet incomplete, nature. In this work, we focus on Contrastive Language-Image Pre-training (CLIP), an open-vocabulary foundation model that achieves high accuracy across many image classification tasks and is often competitive with fully supervised baselines without being explicitly trained for them. Nevertheless, there are still domains where zero-shot CLIP performance is far from optimal, such as Remote Sensing (RS) and medical imagery. These domains not only exhibit fundamentally different distributions from natural images, but also commonly rely on complementary modalities, beyond RGB, to derive meaningful insights. To this end, we propose a methodology for aligning distinct RS imagery modalities with the visual and textual modalities of CLIP. Our two-stage procedure comprises robust fine-tuning of CLIP to deal with the distribution shift, followed by the cross-modal alignment of an RS modality encoder, in an effort to extend the zero-shot capabilities of CLIP. We ultimately demonstrate our method on the tasks of RS imagery classification and cross-modal retrieval, and empirically show that both robust fine-tuning and cross-modal alignment translate to significant performance gains across several RS benchmark datasets. Notably, these enhancements are achieved without reliance on textual descriptions, without introducing any task-specific parameters, without training from scratch, and without catastrophic forgetting.
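A hedged sketch of the second stage, cross-modal alignment: a new RS-modality encoder is trained so its embeddings land next to the frozen CLIP image embeddings of paired RGB views, via the symmetric contrastive objective CLIP itself uses. The `SARPatchEncoder` module, its shapes, and the paired-embedding setup are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SARPatchEncoder(nn.Module):
    """Toy encoder for a non-RGB RS modality (e.g., a 2-channel SAR patch)."""
    def __init__(self, in_ch=2, dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 7, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim))
    def forward(self, x):
        return self.net(x)

def clip_style_alignment_loss(z_rs, z_clip, temperature=0.07):
    # L2-normalize, then apply a symmetric InfoNCE loss, treating matched
    # (RS patch, frozen CLIP embedding) pairs as positives.
    z_rs, z_clip = F.normalize(z_rs, dim=-1), F.normalize(z_clip, dim=-1)
    logits = z_rs @ z_clip.t() / temperature
    targets = torch.arange(len(z_rs))
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

rs_encoder = SARPatchEncoder()
sar = torch.randn(8, 2, 224, 224)   # batch of SAR patches
z_clip = torch.randn(8, 512)        # stand-in: frozen CLIP embeddings of paired RGB views
loss = clip_style_alignment_loss(rs_encoder(sar), z_clip)
loss.backward()
```

Because only the new encoder receives gradients, CLIP's original weights, and hence its zero-shot behaviour, are untouched, consistent with the no-catastrophic-forgetting claim.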
Abstract: Global floods, exacerbated by climate change, pose severe threats to human life, infrastructure, and the environment. This urgency is highlighted by recent catastrophic events in Pakistan and New Zealand, underlining the critical need for precise flood mapping to guide restoration efforts, understand vulnerabilities, and prepare for future events. While Synthetic Aperture Radar (SAR) offers day-and-night, all-weather imaging capabilities, harnessing it for deep learning is hindered by the absence of a large annotated dataset. To bridge this gap, we introduce Kuro Siwo, a meticulously curated multi-temporal dataset spanning 32 flood events globally. Our dataset maps more than 63 billion m$^2$ of land, 12.1 billion m$^2$ of which are either flooded areas or permanent water bodies. Kuro Siwo stands out for its unparalleled annotation quality, facilitating rapid flood mapping in a supervised setting. We also augment learning by including a large unlabeled set of SAR samples aimed at self-supervised pretraining. We provide an extensive benchmark and strong baselines for a diverse set of flood events from Europe, America, Africa and Australia. Our benchmark demonstrates the quality of the Kuro Siwo annotations, with trained models achieving $\approx$ 85% and $\approx$ 87% F1-score for flooded areas and general water detection, respectively. This work calls on the deep learning community to develop solution-driven algorithms for rapid flood mapping, with the potential to aid civil protection and humanitarian agencies amid climate change challenges. Our code and data will be made available at https://github.com/Orion-AI-Lab/KuroSiwo
Abstract: Over the last decade there has been an increasing frequency and intensity of wildfires across the globe, posing significant threats to human and animal lives, ecosystems, and socio-economic stability. Urgent action is therefore required to mitigate their devastating impact and safeguard Earth's natural resources. Robust Machine Learning methods, combined with the abundance of high-resolution satellite imagery, can provide accurate and timely mappings of the affected area in order to assess the scale of the event, identify the impacted assets, and prioritize and allocate resources effectively for the proper restoration of the damaged region. In this work, we create and introduce a machine-learning-ready dataset we name FLOGA (Forest wiLdfire Observations for the Greek Area). This dataset is unique as it comprises satellite imagery acquired before and after a wildfire event, drawn from Sentinel-2 and MODIS modalities with variable spatial and spectral resolution, and covers a large number of events for which the corresponding burnt area ground truth has been annotated by domain experts. FLOGA covers the wider region of Greece, which is characterized by a Mediterranean landscape and climatic conditions. We use FLOGA to provide a thorough comparison of multiple Machine Learning and Deep Learning algorithms for the automatic extraction of burnt areas, approached as a change detection task. We also compare the results to those obtained using standard specialized spectral indices for burnt area mapping. Finally, we propose a novel Deep Learning model, namely BAM-CD. Our benchmark results demonstrate the efficacy of the proposed technique in the automatic extraction of burnt areas, outperforming all other methods in terms of accuracy and robustness. Our dataset and code are publicly available at: https://github.com/Orion-AI-Lab/FLOGA.
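A minimal sketch of burnt-area mapping framed as bi-temporal change detection: pre- and post-fire images are encoded with shared weights and compared per pixel. This is a generic Siamese-difference baseline under assumed shapes, not the BAM-CD architecture.

```python
import torch
import torch.nn as nn

class SiameseDiffCD(nn.Module):
    def __init__(self, in_ch=4):  # e.g., 4 Sentinel-2 bands (assumption)
        super().__init__()
        # Shared encoder applied to both acquisition dates.
        self.encoder = nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(32, 1, 1)  # per-pixel change (burnt) logit
    def forward(self, pre, post):
        # Absolute feature difference highlights what changed between dates.
        return self.head(torch.abs(self.encoder(post) - self.encoder(pre)))

model = SiameseDiffCD()
pre, post = torch.randn(2, 4, 256, 256), torch.randn(2, 4, 256, 256)
burnt_logits = model(pre, post)  # shape: (2, 1, 256, 256)
```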
Abstract: Wildfires are increasingly exacerbated as a result of climate change, necessitating advanced proactive measures for effective mitigation. It is important to forecast wildfires weeks and months in advance to plan forest fuel management, resource procurement and allocation. To achieve such accurate long-term forecasts at a global scale, it is crucial to employ models that account for the Earth system's inherent spatio-temporal interactions, such as memory effects and teleconnections. We propose a teleconnection-driven vision transformer (TeleViT), capable of treating the Earth as one interconnected system, integrating fine-grained local-scale inputs with global-scale inputs, such as climate indices and coarse-grained global variables. Through comprehensive experimentation, we demonstrate the superiority of TeleViT in accurately predicting global burned area patterns for various forecasting windows, up to four months in advance. The gain is especially pronounced in larger forecasting windows, demonstrating the improved ability of deep learning models that exploit teleconnections to capture Earth system dynamics. Code available at https://github.com/Orion-Ai-Lab/TeleViT.
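A hedged sketch of the core idea of mixing scales in a transformer: tokens derived from global-scale inputs (climate indices, a coarse global map) are prepended to the local patch tokens before the encoder, so self-attention can relate local fire drivers to global context. Module names and dimensions are assumptions, not the TeleViT implementation.

```python
import torch
import torch.nn as nn

class TeleconnectionTokenizer(nn.Module):
    def __init__(self, n_indices=10, coarse_pixels=18 * 36, dim=256):
        super().__init__()
        self.index_proj = nn.Linear(n_indices, dim)      # climate indices -> 1 token
        self.globe_proj = nn.Linear(coarse_pixels, dim)  # coarse global map -> 1 token
    def forward(self, indices, coarse_map):
        return torch.stack([self.index_proj(indices),
                            self.globe_proj(coarse_map.flatten(1))], dim=1)

dim, n_patches = 256, 64
local_tokens = torch.randn(4, n_patches, dim)  # patch tokens of fine-grained local inputs
tokenizer = TeleconnectionTokenizer(dim=dim)
global_tokens = tokenizer(torch.randn(4, 10), torch.randn(4, 18, 36))

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True), num_layers=2)
# Self-attention now runs jointly over global and local tokens.
fused = encoder(torch.cat([global_tokens, local_tokens], dim=1))  # (4, 66, 256)
```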
Abstract: Climate change is expected to aggravate wildfire activity through the exacerbation of fire weather. Improving our capabilities to anticipate wildfires on a global scale is of utmost importance for mitigating their negative effects. In this work, we create a global fire dataset and demonstrate a prototype for predicting the presence of global burned areas on a sub-seasonal scale with the use of segmentation deep learning models. In particular, we present an open-access global analysis-ready datacube, which contains a variety of variables related to the seasonal and sub-seasonal fire drivers (climate, vegetation, oceanic indices, human-related variables), as well as the historical burned areas and wildfire emissions for 2001-2021. We train a deep learning model, which treats global wildfire forecasting as an image segmentation task and skillfully predicts the presence of burned areas 8, 16, 32 and 64 days ahead of time. Our work motivates the use of deep learning for global burned area forecasting and paves the way towards improved anticipation of global wildfire patterns.
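A minimal sketch of framing burned-area forecasting as image segmentation: the fire-driver variables are stacked as input channels and a per-pixel binary burned-area mask is predicted at a fixed lead time. The channel count, grid resolution and the tiny stand-in network are illustrative assumptions; the paper's model and datacube layout may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_drivers = 12                 # e.g., climate + vegetation + human variables (assumption)
model = nn.Sequential(         # stand-in for a U-Net-style segmentation network
    nn.Conv2d(n_drivers, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 1))       # per-pixel burned-area logit

x = torch.randn(2, n_drivers, 180, 360)                # global grid at ~1 degree
target = (torch.rand(2, 1, 180, 360) > 0.99).float()   # sparse burned pixels at lead time
loss = F.binary_cross_entropy_with_logits(model(x), target)
```

Training one such target per lead time (8, 16, 32, 64 days) yields the family of forecasts the abstract describes.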
Abstract: Synthetic Aperture Radar (SAR) data, and Interferometric SAR (InSAR) products in particular, are one of the largest sources of Earth Observation data. InSAR provides unique information on diverse geophysical processes and geology, and on the geotechnical properties of man-made structures. However, only a limited number of applications exploit the abundance of InSAR data and deep learning methods to extract such knowledge. The main barrier has been the lack of a large curated and annotated InSAR dataset, which would be costly to create and would require an interdisciplinary team of experts experienced in InSAR data interpretation. In this work, we undertake the effort to create and make available the first of its kind, manually annotated dataset, consisting of 19,919 individual Sentinel-1 interferograms acquired over 44 different volcanoes globally, which are split into 216,106 InSAR patches. The annotated dataset is designed to address different computer vision problems, including volcano state classification, semantic segmentation of ground deformation, detection and classification of atmospheric signals in InSAR imagery, interferogram captioning, text-to-InSAR generation, and InSAR image quality assessment.
Abstract: Ground deformation measured from Interferometric Synthetic Aperture Radar (InSAR) data is considered a sign of volcanic unrest, statistically linked to a volcanic eruption. Recent studies have shown the potential of using Sentinel-1 InSAR data and supervised deep learning (DL) methods for the detection of volcanic deformation signals, towards global volcanic hazard mitigation. However, detection accuracy is compromised by the lack of labelled data and class imbalance. To overcome this, synthetic data are typically used to fine-tune DL models pre-trained on the ImageNet dataset. This approach suffers from poor generalisation on real InSAR data. This letter proposes the use of self-supervised contrastive learning to learn quality visual representations hidden in unlabeled InSAR data. Our approach, based on the SimCLR framework, provides a solution that requires neither a specialized architecture nor a large labelled or synthetic dataset. We show that our self-supervised pipeline achieves higher accuracy with respect to the state-of-the-art methods, and shows excellent generalisation even for out-of-distribution test data. Finally, we showcase the effectiveness of our approach for detecting the unrest episodes preceding the recent Icelandic Fagradalsfjall volcanic eruption.
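A sketch of the SimCLR-style objective such a pipeline rests on: two augmented views of each unlabeled interferogram are embedded, and the NT-Xent loss pulls matching views together while pushing all other pairs apart. The encoder and the random tensors standing in for augmented views are toy placeholders.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Normalized temperature-scaled cross-entropy loss over 2N embeddings."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float('-inf'))  # exclude self-similarity from the softmax
    n = len(z1)
    # Positive for sample i is its other view: i+n for the first half, i-n for the second.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 128))
views_a = torch.randn(16, 1, 64, 64)  # augmented view A of an interferogram batch
views_b = torch.randn(16, 1, 64, 64)  # augmented view B of the same batch
loss = nt_xent(encoder(views_a), encoder(views_b))
```

No labels, synthetic data or specialized architecture enter the loop, which is the point the abstract makes.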
Abstract: The detection of early signs of volcanic unrest preceding an eruption, in the form of ground deformation in Interferometric Synthetic Aperture Radar (InSAR) data, is critical for assessing volcanic hazard. In this work we treat this as a binary classification problem of InSAR images, and propose a novel deep learning methodology that exploits a rich source of synthetically generated interferograms to train quality classifiers that perform equally well on real interferograms. The imbalanced nature of the problem, with orders of magnitude fewer positive samples, coupled with the lack of a curated database of labeled InSAR data, sets a challenging task for conventional deep learning architectures. We propose a new framework for domain adaptation, in which we learn class prototypes from synthetic data with vision transformers. We report detection accuracy that surpasses the state of the art on volcanic unrest detection. Moreover, we build upon this knowledge by learning a new, non-linear projection between the learnt representations and the prototype space, using pseudo-labels produced by our model on an unlabeled real InSAR dataset. This leads to a new state of the art, with $97.1\%$ accuracy on our test set. We demonstrate the robustness of our approach by training a simple ResNet-18 Convolutional Neural Network on the unlabeled real InSAR dataset with pseudo-labels generated by our top transformer-prototype model. Our methodology provides a significant improvement in performance without the need to manually label any sample, opening the road for further exploitation of synthetic InSAR data in various remote sensing applications.
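A hedged sketch of the prototype-plus-pseudo-label pattern described above: class prototypes are the mean embeddings of labelled synthetic samples, and unlabeled real interferograms are pseudo-labelled by their nearest prototype when the assignment is confident. All embeddings and the confidence threshold are illustrative stand-ins.

```python
import torch
import torch.nn.functional as F

emb_syn = torch.randn(200, 128)        # embeddings of labelled synthetic interferograms
y_syn = torch.randint(0, 2, (200,))    # 0: no unrest, 1: ground deformation
# Class prototype = mean embedding per class.
prototypes = torch.stack([emb_syn[y_syn == c].mean(0) for c in (0, 1)])

emb_real = torch.randn(500, 128)       # embeddings of unlabeled real InSAR patches
dists = torch.cdist(emb_real, prototypes)   # distance to each class prototype
probs = F.softmax(-dists, dim=1)            # closer prototype -> higher probability
conf, pseudo = probs.max(dim=1)
keep = conf > 0.9                            # retain only confident pseudo-labels
print(f"{keep.sum().item()} pseudo-labelled samples available for retraining")
```

The retained `(emb_real[keep], pseudo[keep])` pairs are what a downstream student such as the ResNet-18 would be trained on.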
Abstract: The availability of the sheer volume of Copernicus Sentinel imagery has created new opportunities for land use land cover (LULC) mapping at large scales using deep learning. Training on such large datasets, though, is a non-trivial task. In this work we experiment with the BigEarthNet dataset for LULC image classification and benchmark different state-of-the-art models, including Convolutional Neural Networks, Multi-Layer Perceptrons, Vision Transformers, EfficientNets and Wide Residual Networks (WRN) architectures. Our aim is to jointly optimize classification accuracy, training time and inference rate. We propose a framework based on EfficientNets for compound scaling of WRNs in terms of network depth, width and input data resolution, for efficiently training and testing different model setups. We design a novel scaled WRN architecture enhanced with an Efficient Channel Attention mechanism. Our proposed lightweight model has an order of magnitude fewer trainable parameters, achieves a 4.5% higher average F-score across all 19 LULC classes, and trains two times faster than the ResNet50 state-of-the-art model that we use as a baseline. We provide access to more than 50 trained models, along with our code for distributed training on multiple GPU nodes.
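An illustrative sketch of EfficientNet-style compound scaling applied to a Wide ResNet: depth, width and input resolution are scaled jointly by a single coefficient phi. The base configuration and the alpha/beta/gamma multipliers below are assumed values for demonstration, not the paper's tuned constants.

```python
# Compound scaling: one coefficient phi grows all three dimensions together,
# instead of tuning depth, width and resolution independently.
alpha, beta, gamma = 1.2, 1.1, 1.15       # depth / width / resolution multipliers (assumed)
base_depth, base_width, base_res = 16, 4, 120  # assumed WRN base configuration

def scaled_wrn_config(phi: int):
    depth = round(base_depth * alpha ** phi)  # number of layers
    width = round(base_width * beta ** phi)   # WRN widening factor
    res = round(base_res * gamma ** phi)      # input pixels per side
    return depth, width, res

for phi in range(4):
    print(phi, scaled_wrn_config(phi))
```

Sweeping phi produces the family of progressively larger model setups that the proposed framework trains and compares.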