Abstract:Recent advancements in Earth system science have been marked by the exponential increase in the availability of diverse, multivariate datasets characterised by moderate to high spatio-temporal resolutions. Earth System Data Cubes (ESDCs) have emerged as one suitable solution for transforming this flood of data into a simple yet robust data structure. ESDCs achieve this by organising data into an analysis-ready format aligned with a spatio-temporal grid, facilitating user-friendly analysis and diminishing the need for extensive technical data processing knowledge. Despite these significant benefits, the completion of the entire ESDC life cycle remains a challenging task. Obstacles are not only of a technical nature but also relate to domain-specific problems in Earth system research. There exist barriers to realising the full potential of data collections in light of novel cloud-based technologies, particularly in curating data tailored for specific application domains. These include transforming data to conform to a spatio-temporal grid with minimum distortions and managing complexities such as spatio-temporal autocorrelation issues. Addressing these challenges is pivotal for the effective application of Artificial Intelligence (AI) approaches. Furthermore, adhering to open science principles for data dissemination, reproducibility, visualisation, and reuse is crucial for fostering sustainable research. Overcoming these challenges offers a substantial opportunity to advance data-driven Earth system research, unlocking the full potential of an integrated, multidimensional view of Earth system processes. This is particularly true when such research is coupled with innovative research paradigms and technological progress.
Abstract:The Sentinel-2 (S2) mission from the European Space Agency's Copernicus program provides essential data for Earth surface analysis. Its Level-2A products deliver high-to-medium resolution (10-60 m) surface reflectance (SR) data through the MultiSpectral Instrument (MSI). To enhance the accuracy and comparability of SR data, adjustments simulating a nadir viewing perspective are essential. These corrections address the anisotropic nature of SR and the variability in sun and observation angles, ensuring consistent image comparisons over time and under different conditions. The $c$-factor method, a simple yet effective algorithm, adjusts observed S2 SR by using the MODIS BRDF model to achieve Nadir BRDF Adjusted Reflectance (NBAR). Despite the straightforward application of the $c$-factor to individual images, a cohesive Python framework for its application across multiple S2 images and Earth System Data Cubes (ESDCs) from cloud-stored data has been lacking. Here we introduce sen2nbar, a Python package crafted to convert S2 SR data to NBAR, supporting both individual images and ESDCs derived from cloud-stored data. This package simplifies the conversion of S2 SR data to NBAR via a single function, organized into modules for efficient process management. By facilitating NBAR conversion for both SAFE files and ESDCs from SpatioTemporal Asset Catalogs (STAC), sen2nbar is developed as a flexible tool that can handle diverse data format requirements. We anticipate that sen2nbar will considerably contribute to the standardization and harmonization of S2 data, offering a robust solution for a diverse range of users across various applications. sen2nbar is an open-source tool available at https://github.com/ESDS-Leipzig/sen2nbar.
Abstract:Advancements in Earth system science have seen a surge in diverse datasets. Earth System Data Cubes (ESDCs) have been introduced to efficiently handle this influx of high-dimensional data. ESDCs offer a structured, intuitive framework for data analysis, organising information within spatio-temporal grids. The structured nature of ESDCs unlocks significant opportunities for Artificial Intelligence (AI) applications. By providing well-organised data, ESDCs are ideally suited for a wide range of sophisticated AI-driven tasks. An automated framework for creating AI-focused ESDCs with minimal user input could significantly accelerate the generation of task-specific training data. Here we introduce cubo, an open-source Python tool designed for easy generation of AI-focused ESDCs. Utilising collections in SpatioTemporal Asset Catalogs (STAC) that are stored as Cloud Optimised GeoTIFFs (COGs), cubo efficiently creates ESDCs, requiring only central coordinates, spatial resolution, edge size, and time range.
Abstract:Accurate quantification of Gross Primary Production (GPP) is crucial for understanding terrestrial carbon dynamics. It represents the largest atmosphere-to-land CO$_2$ flux, especially significant for forests. Eddy Covariance (EC) measurements are widely used for ecosystem-scale GPP quantification but are globally sparse. In areas lacking local EC measurements, remote sensing (RS) data are typically utilised to estimate GPP after statistically relating them to in-situ data. Deep learning offers novel perspectives, and the potential of recurrent neural network architectures for estimating daily GPP remains underexplored. This study presents a comparative analysis of three architectures: Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs), and Long-Short Term Memory (LSTMs). Our findings reveal comparable performance across all models for full-year and growing season predictions. Notably, LSTMs outperform in predicting climate-induced GPP extremes. Furthermore, our analysis highlights the importance of incorporating radiation and RS inputs (optical, temperature, and radar) for accurate GPP predictions, particularly during climate extremes.