Abstract:Water quality is of great importance for humans and for the environment and has to be monitored continuously. It can be assessed through proxies such as the chlorophyll $a$ concentration, which can be monitored by remote sensing techniques. This study focuses on the trade-off between the spatial and the spectral resolution of six simulated satellite-based data sets when estimating the chlorophyll $a$ concentration with supervised machine learning models. The initial dataset for the spectral simulation of the satellite missions contains spectrometer data and measured chlorophyll $a$ concentrations of 13 different inland waters. In terms of regression performance, the machine learning models achieve almost as good results with the simulated Sentinel data as with the simulated hyperspectral data. In terms of applicability, the Sentinel-2 mission is the best choice for small inland waters due to its high spatial and temporal resolution in combination with a suitable spectral resolution.
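The spectral simulation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a simple boxcar response, that is, a plain average of all spectrometer samples inside each band window. The function name `simulate_bands`, the example spectrum and the band windows are hypothetical and are not the sensor response functions used in the study.

```python
def simulate_bands(wavelengths, reflectance, band_edges):
    """Average the reflectance inside each (lower, upper) wavelength window."""
    simulated = []
    for lower, upper in band_edges:
        values = [r for w, r in zip(wavelengths, reflectance) if lower <= w <= upper]
        if not values:
            raise ValueError(f"no samples between {lower} and {upper} nm")
        simulated.append(sum(values) / len(values))
    return simulated


# Hypothetical spectrum sampled every 10 nm from 400 nm to 700 nm.
wavelengths = list(range(400, 701, 10))
reflectance = [0.01 * (i % 5) for i in range(len(wavelengths))]

# Illustrative band windows in nm, not official sensor specifications.
band_edges = [(450, 520), (540, 580), (650, 680)]
bands = simulate_bands(wavelengths, reflectance, band_edges)
print(bands)
```

A real simulation would typically weight each sample by the published spectral response function of the sensor instead of using equal weights.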
Abstract:Water is a key component of life, the natural environment and human health. For monitoring the condition of a water body, the chlorophyll a concentration can serve as a proxy for nutrients and oxygen supply. In situ measurements of water quality parameters are often time-consuming, expensive and limited in areal validity. Therefore, we apply remote sensing techniques. During field campaigns, we collected hyperspectral data with a spectrometer and measured chlorophyll a concentrations in situ for 13 inland water bodies with different spectral characteristics. One objective of this study is to estimate chlorophyll a concentrations of these inland waters by applying three machine learning regression models: Random Forest, Support Vector Machine and an Artificial Neural Network. Additionally, we simulate four different hyperspectral resolutions of the spectrometer data to investigate the effects on the estimation performance. Furthermore, the application of first-order derivatives of the spectra is evaluated with respect to the regression performance. This study reveals the potential of combining machine learning approaches and remote sensing data for inland waters. Each machine learning model achieves an $R^2$ score between 80 % and 90 % for the regression on chlorophyll a concentrations. The random forest model benefits clearly from the applied derivatives of the spectra. In further studies, we will focus on the application of machine learning models on spectral satellite data to enhance the area-wide estimation of chlorophyll a concentration for inland waters.
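Two quantities named in the abstract above, the first-order derivative of a spectrum and the $R^2$ score, can be sketched in a few lines. This is only an illustration under stated assumptions: the derivative is a forward finite difference, which may differ from the scheme used in the study, and all numbers below are made up.

```python
def first_derivative(wavelengths, reflectance):
    """Forward finite difference d(reflectance) / d(wavelength)."""
    return [
        (reflectance[i + 1] - reflectance[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(reflectance) - 1)
    ]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical four-sample spectrum (wavelength in nm, unitless reflectance).
derivative = first_derivative([400, 410, 420, 430], [0.10, 0.12, 0.11, 0.15])

# Hypothetical measured vs. estimated chlorophyll a concentrations.
score = r2_score([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(derivative, score)
```

The derivative has one element fewer than the input spectrum, so a regression model trained on derivatives sees a slightly shorter feature vector.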
Abstract:In many research fields, the sizes of the existing datasets vary widely. Hence, there is a need for machine learning techniques which are well-suited for these different datasets. One possible technique is the self-organizing map (SOM), a type of artificial neural network which is, so far, weakly represented in the field of machine learning. The SOM's unique characteristic is the neighborhood relationship of the output neurons. This relationship improves the ability to generalize on small datasets. SOMs are mostly applied in unsupervised learning, and few studies focus on using SOMs as a supervised learning approach. Furthermore, no appropriate SOM package is available that meets machine learning standards and is written in the widely used programming language Python. In this paper, we introduce the freely available SUpervised Self-organIzing maps (SUSI) Python package which performs supervised regression and classification. The implementation of SUSI is described with respect to the underlying mathematics. Then, we present first evaluations of the SOM for regression and classification datasets from two different domains of geospatial image analysis. Despite the early stage of its development, the SUSI framework performs well and is characterized by only small performance differences between the training and the test datasets. A comparison of the SUSI framework with existing Python and R packages demonstrates the importance of the SUSI framework. In future work, the SUSI framework will be extended, optimized and upgraded, e.g., with tools to better understand and visualize the input data as well as the handling of missing and incomplete data.
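The neighborhood relationship of the output neurons can be illustrated with one generic unsupervised SOM training step. The sketch below is not the SUSI implementation. The function names, the Gaussian neighborhood function, the learning rate and the radius are assumptions chosen for illustration.

```python
import math


def best_matching_unit(weights, sample):
    """Return the (row, col) of the node whose weight vector is closest to sample."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - si) ** 2 for wi, si in zip(w, sample))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def som_update(weights, sample, learning_rate=0.5, radius=1.0):
    """Pull every node toward sample, scaled by a Gaussian of its grid distance to the BMU."""
    bmu_r, bmu_c = best_matching_unit(weights, sample)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_dist_sq = (r - bmu_r) ** 2 + (c - bmu_c) ** 2
            influence = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
            row[c] = [wi + learning_rate * influence * (si - wi) for wi, si in zip(w, sample)]
    return (bmu_r, bmu_c)


# Hypothetical 2x2 grid of 2-dimensional weight vectors.
weights = [[[0.0, 0.0], [1.0, 0.0]],
           [[0.0, 1.0], [1.0, 1.0]]]
bmu = som_update(weights, sample=[0.9, 0.1])
print(bmu, weights)
```

Because nodes close to the best matching unit on the grid are moved more than distant ones, neighboring nodes end up with similar weight vectors, which is the property the abstract refers to.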
Abstract:Soil texture is important for many environmental processes. In this paper, we study the classification of soil texture based on hyperspectral data. We develop and implement three 1-dimensional (1D) convolutional neural networks (CNN): the LucasCNN, the LucasResNet which contains an identity block as residual network, and the LucasCoordConv with an additional coordinates layer. Furthermore, we modify two existing 1D CNN approaches for the presented classification task. The code of all five CNN approaches is available on GitHub (Riese, 2019). We evaluate the performance of the CNN approaches and compare them to a random forest classifier. Thereby, we rely on the freely available LUCAS topsoil dataset. The CNN approach with the least depth turns out to be the best performing classifier. The LucasCoordConv achieves the best performance regarding the average accuracy. In future work, we can further enhance the introduced LucasCNN, LucasResNet and LucasCoordConv and include additional variables of the rich LUCAS dataset.
Abstract:In this contribution, we investigate the potential of hyperspectral data combined with either simulated ground penetrating radar (GPR) or simulated (sensor-like) soil-moisture data to estimate soil moisture. We propose two simulation approaches to extend a given multi-sensor dataset which contains sparse GPR data. In the first approach, simulated GPR data is generated either by an interpolation along the time axis or by a machine learning model. The second approach includes the simulation of soil moisture along the GPR profile. The soil-moisture estimation is improved significantly by the fusion of hyperspectral and GPR data. In contrast, the combination of simulated, sensor-like soil-moisture values and hyperspectral data achieves the worst regression performance. In conclusion, the estimation of soil moisture with hyperspectral and GPR data warrants further investigation.
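The interpolation along the time axis can be sketched as follows. The abstract does not state which interpolation scheme is used, so this example assumes plain linear interpolation. The function name, timestamps and values are hypothetical.

```python
def interpolate_linear(times, values, query_times):
    """Linearly interpolate values at query_times.

    times must be strictly increasing and contain at least two entries.
    """
    result = []
    for q in query_times:
        if q < times[0] or q > times[-1]:
            raise ValueError("query time outside the measured range")
        for i in range(len(times) - 1):
            t0, t1 = times[i], times[i + 1]
            if t0 <= q <= t1:
                fraction = (q - t0) / (t1 - t0)
                result.append(values[i] + fraction * (values[i + 1] - values[i]))
                break
    return result


# Hypothetical sparse measurements (time in minutes, arbitrary units).
sparse_times = [0, 10, 30]
sparse_values = [2.0, 4.0, 3.0]

# Hypothetical timestamps of the denser hyperspectral acquisitions.
dense_times = [5, 10, 20]
simulated = interpolate_linear(sparse_times, sparse_values, dense_times)
print(simulated)
```

The sketch refuses to extrapolate beyond the first and last measurement, since values outside the measured range would not be supported by any data.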
Abstract:In this paper, we present a regression framework involving several machine learning models to estimate water parameters based on hyperspectral data. Measurements from a multi-sensor field campaign, conducted on the River Elbe, Germany, represent the benchmark dataset. It contains hyperspectral data and the five water parameters chlorophyll a, green algae, diatoms, CDOM and turbidity. We apply a PCA for the high-dimensional data as a possible preprocessing step. Then, we evaluate the performance of the regression framework with and without this preprocessing step. The regression results of the framework clearly reveal the potential of estimating water parameters based on hyperspectral data with machine learning. The proposed framework provides the basis for further investigations, such as adapting the framework to estimate water parameters of different inland waters.
Abstract:In this paper, we investigate the potential of estimating the soil-moisture content based on VNIR hyperspectral data combined with LWIR data. Measurements from a multi-sensor field campaign conducted on a grassland site represent the benchmark dataset, which contains hyperspectral, LWIR, and soil-moisture data. We introduce a regression framework with three steps consisting of feature selection, preprocessing, and well-chosen regression models. The latter are mainly supervised machine learning models. An exception is the self-organizing map, which combines unsupervised and supervised learning. We analyze the impact of the distinct preprocessing methods on the regression results. Of all regression models, the extremely randomized trees model without preprocessing provides the best estimation performance. Our results reveal the potential of the respective regression framework combined with the VNIR hyperspectral data to estimate soil moisture measured under real-world conditions. In conclusion, the results of this paper provide a basis for further improvements in different research directions.