Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Benjamin Bischke

Vision Impulse GmbH and DFKI, Germany

Adaptive Fusion of Multi-view Remote Sensing data for Optimal Sub-field Crop Yield Prediction

Jan 22, 2024

Francisco Mena, Deepak Pathak, Hiba Najjar, Cristhian Sanchez, Patrick Helber, Benjamin Bischke, Peter Habelitz, Miro Miranda, Jayanth Siddamsetty, Marlon Nuske(+4 more)

Figure 1 for Adaptive Fusion of Multi-view Remote Sensing data for Optimal Sub-field Crop Yield Prediction

Figure 2 for Adaptive Fusion of Multi-view Remote Sensing data for Optimal Sub-field Crop Yield Prediction

Figure 3 for Adaptive Fusion of Multi-view Remote Sensing data for Optimal Sub-field Crop Yield Prediction

Figure 4 for Adaptive Fusion of Multi-view Remote Sensing data for Optimal Sub-field Crop Yield Prediction

Abstract:Accurate crop yield prediction is of utmost importance for informed decision-making in agriculture, aiding farmers, and industry stakeholders. However, this task is complex and depends on multiple factors, such as environmental conditions, soil properties, and management practices. Combining heterogeneous data views poses a fusion challenge, like identifying the view-specific contribution to the predictive task. We present a novel multi-view learning approach to predict crop yield for different crops (soybean, wheat, rapeseed) and regions (Argentina, Uruguay, and Germany). Our multi-view input data includes multi-spectral optical images from Sentinel-2 satellites and weather data as dynamic features during the crop growing season, complemented by static features like soil properties and topographic information. To effectively fuse the data, we introduce a Multi-view Gated Fusion (MVGF) model, comprising dedicated view-encoders and a Gated Unit (GU) module. The view-encoders handle the heterogeneity of data sources with varying temporal resolutions by learning a view-specific representation. These representations are adaptively fused via a weighted sum. The fusion weights are computed for each sample by the GU using a concatenation of the view-representations. The MVGF model is trained at sub-field level with 10 m resolution pixels. Our evaluations show that the MVGF outperforms conventional models on the same task, achieving the best results by incorporating all the data sources, unlike the usual fusion results in the literature. For Argentina, the MVGF model achieves an R2 value of 0.68 at sub-field yield prediction, while at field level evaluation (comparing field averages), it reaches around 0.80 across different countries. The GU module learned different weights based on the country and crop-type, aligning with the variable significance of each data source to the prediction task.

Via

Access Paper or Ask Questions

Predicting Crop Yield With Machine Learning: An Extensive Analysis Of Input Modalities And Models On a Field and sub-field Level

Aug 17, 2023

Deepak Pathak, Miro Miranda, Francisco Mena, Cristhian Sanchez, Patrick Helber, Benjamin Bischke, Peter Habelitz, Hiba Najjar, Jayanth Siddamsetty, Diego Arenas(+4 more)

Figure 1 for Predicting Crop Yield With Machine Learning: An Extensive Analysis Of Input Modalities And Models On a Field and sub-field Level

Figure 2 for Predicting Crop Yield With Machine Learning: An Extensive Analysis Of Input Modalities And Models On a Field and sub-field Level

Figure 3 for Predicting Crop Yield With Machine Learning: An Extensive Analysis Of Input Modalities And Models On a Field and sub-field Level

Figure 4 for Predicting Crop Yield With Machine Learning: An Extensive Analysis Of Input Modalities And Models On a Field and sub-field Level

Abstract:We introduce a simple yet effective early fusion method for crop yield prediction that handles multiple input modalities with different temporal and spatial resolutions. We use high-resolution crop yield maps as ground truth data to train crop and machine learning model agnostic methods at the sub-field level. We use Sentinel-2 satellite imagery as the primary modality for input data with other complementary modalities, including weather, soil, and DEM data. The proposed method uses input modalities available with global coverage, making the framework globally scalable. We explicitly highlight the importance of input modalities for crop yield prediction and emphasize that the best-performing combination of input modalities depends on region, crop, and chosen model.

* 4 pages, 1 figure, 3 tables, IEEE IGARSS 2023

Via

Access Paper or Ask Questions

RapidAI4EO: Mono- and Multi-temporal Deep Learning models for Updating the CORINE Land Cover Product

Oct 26, 2022

Priyash Bhugra, Benjamin Bischke, Christoph Werner, Robert Syrnicki, Carolin Packbier, Patrick Helber, Caglar Senaras, Akhil Singh Rana, Tim Davis, Wanda De Keersmaecker(+4 more)

Figure 1 for RapidAI4EO: Mono- and Multi-temporal Deep Learning models for Updating the CORINE Land Cover Product

Figure 2 for RapidAI4EO: Mono- and Multi-temporal Deep Learning models for Updating the CORINE Land Cover Product

Figure 3 for RapidAI4EO: Mono- and Multi-temporal Deep Learning models for Updating the CORINE Land Cover Product

Figure 4 for RapidAI4EO: Mono- and Multi-temporal Deep Learning models for Updating the CORINE Land Cover Product

Abstract:In the remote sensing community, Land Use Land Cover (LULC) classification with satellite imagery is a main focus of current research activities. Accurate and appropriate LULC classification, however, continues to be a challenging task. In this paper, we evaluate the performance of multi-temporal (monthly time series) compared to mono-temporal (single time step) satellite images for multi-label classification using supervised learning on the RapidAI4EO dataset. As a first step, we trained our CNN model on images at a single time step for multi-label classification, i.e. mono-temporal. We incorporated time-series images using a LSTM model to assess whether or not multi-temporal signals from satellites improves CLC classification. The results demonstrate an improvement of approximately 0.89% in classifying satellite imagery on 15 classes using a multi-temporal approach on monthly time series images compared to the mono-temporal approach. Using features from multi-temporal or mono-temporal images, this work is a step towards an efficient change detection and land monitoring approach.

* Published in IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium

Via

Access Paper or Ask Questions

RapidAI4EO: A Corpus for Higher Spatial and Temporal Reasoning

Oct 05, 2021

Giovanni Marchisio, Patrick Helber, Benjamin Bischke, Timothy Davis, Caglar Senaras, Daniele Zanaga, Ruben Van De Kerchove, Annett Wania

Figure 1 for RapidAI4EO: A Corpus for Higher Spatial and Temporal Reasoning

Figure 2 for RapidAI4EO: A Corpus for Higher Spatial and Temporal Reasoning

Figure 3 for RapidAI4EO: A Corpus for Higher Spatial and Temporal Reasoning

Abstract:Under the sponsorship of the European Union Horizon 2020 program, RapidAI4EO will establish the foundations for the next generation of Copernicus Land Monitoring Service (CLMS) products. The project aims to provide intensified monitoring of Land Use (LU), Land Cover (LC), and LU change at a much higher level of detail and temporal cadence than it is possible today. Focus is on disentangling phenology from structural change and in providing critical training data to drive advancement in the Copernicus community and ecosystem well beyond the lifetime of this project. To this end we are creating the densest spatiotemporal training sets ever by fusing open satellite data with Planet imagery at as many as 500,000 patch locations over Europe and delivering high resolution daily time series at all locations. We plan to open source these datasets for the benefit of the entire remote sensing community.

* Published in the IEEE 2021 International Geoscience & Remote Sensing Symposium (IGARSS 2021)

Via

Access Paper or Ask Questions

Revisiting Sequence-to-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Apr 25, 2020

Fatemeh Azimi, Benjamin Bischke, Sebastian Palacio, Federico Raue, Joern Hees, Andreas Dengel

Figure 1 for Revisiting Sequence-to-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Figure 2 for Revisiting Sequence-to-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Figure 3 for Revisiting Sequence-to-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Figure 4 for Revisiting Sequence-to-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Abstract:Video Object Segmentation (VOS) is an active research area of the visual domain. One of its fundamental sub-tasks is semi-supervised / one-shot learning: given only the segmentation mask for the first frame, the task is to provide pixel-accurate masks for the object over the rest of the sequence. Despite much progress in the last years, we noticed that many of the existing approaches lose objects in longer sequences, especially when the object is small or briefly occluded. In this work, we build upon a sequence-to-sequence approach that employs an encoder-decoder architecture together with a memory module for exploiting the sequential data. We further improve this approach by proposing a model that manipulates multi-scale spatio-temporal information using memory-equipped skip connections. Furthermore, we incorporate an auxiliary task based on distance classification which greatly enhances the quality of edges in segmentation masks. We compare our approach to the state of the art and show considerable improvement in the contour accuracy metric and the overall segmentation accuracy.

Via

Access Paper or Ask Questions

Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery

Dec 05, 2018

Tim G. J. Rudner, Marc Rußwurm, Jakub Fil, Ramona Pelich, Benjamin Bischke, Veronika Kopackova, Piotr Bilinski

$Figure 1 for Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery$

$Figure 2 for Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery$

$Figure 3 for Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery$

$Figure 4 for Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery$

Abstract:We propose a novel approach for rapid segmentation of flooded buildings by fusing multiresolution, multisensor, and multitemporal satellite imagery in a convolutional neural network. Our model significantly expedites the generation of satellite imagery-based flood maps, crucial for first responders and local authorities in the early stages of flood events. By incorporating multitemporal satellite imagery, our model allows for rapid and accurate post-disaster damage assessment and can be used by governments to better coordinate medium- and long-term financial assistance programs for affected areas. The network consists of multiple streams of encoder-decoder architectures that extract spatiotemporal information from medium-resolution images and spatial information from high-resolution images before fusing the resulting representations into a single medium-resolution segmentation map of flooded buildings. We compare our model to state-of-the-art methods for building footprint segmentation as well as to alternative fusion approaches for the segmentation of flooded buildings and find that our model performs best on both tasks. We also demonstrate that our model produces highly accurate segmentation maps of flooded buildings using only publicly available medium-resolution data instead of significantly more detailed but sparsely available very high-resolution data. We release the first open-source dataset of fully preprocessed and labeled multiresolution, multispectral, and multitemporal satellite images of disaster sites along with our source code.

* To appear in Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

Via

Access Paper or Ask Questions

Overcoming Missing and Incomplete Modalities with Generative Adversarial Networks for Building Footprint Segmentation

Aug 09, 2018

Benjamin Bischke, Patrick Helber, Florian König, Damian Borth, Andreas Dengel

Figure 1 for Overcoming Missing and Incomplete Modalities with Generative Adversarial Networks for Building Footprint Segmentation

Figure 2 for Overcoming Missing and Incomplete Modalities with Generative Adversarial Networks for Building Footprint Segmentation

Figure 3 for Overcoming Missing and Incomplete Modalities with Generative Adversarial Networks for Building Footprint Segmentation

Figure 4 for Overcoming Missing and Incomplete Modalities with Generative Adversarial Networks for Building Footprint Segmentation

Abstract:The integration of information acquired with different modalities, spatial resolution and spectral bands has shown to improve predictive accuracies. Data fusion is therefore one of the key challenges in remote sensing. Most prior work focusing on multi-modal fusion, assumes that modalities are always available during inference. This assumption limits the applications of multi-modal models since in practice the data collection process is likely to generate data with missing, incomplete or corrupted modalities. In this paper, we show that Generative Adversarial Networks can be effectively used to overcome the problems that arise when modalities are missing or incomplete. Focusing on semantic segmentation of building footprints with missing modalities, our approach achieves an improvement of about 2% on the Intersection over Union (IoU) against the same network that relies only on the available modality.

Via

Access Paper or Ask Questions

Multi-Task Learning for Segmentation of Building Footprints with Deep Neural Networks

Sep 18, 2017

Benjamin Bischke, Patrick Helber, Joachim Folz, Damian Borth, Andreas Dengel

Figure 1 for Multi-Task Learning for Segmentation of Building Footprints with Deep Neural Networks

Figure 2 for Multi-Task Learning for Segmentation of Building Footprints with Deep Neural Networks

Figure 3 for Multi-Task Learning for Segmentation of Building Footprints with Deep Neural Networks

Figure 4 for Multi-Task Learning for Segmentation of Building Footprints with Deep Neural Networks

Abstract:The increased availability of high resolution satellite imagery allows to sense very detailed structures on the surface of our planet. Access to such information opens up new directions in the analysis of remote sensing imagery. However, at the same time this raises a set of new challenges for existing pixel-based prediction methods, such as semantic segmentation approaches. While deep neural networks have achieved significant advances in the semantic segmentation of high resolution images in the past, most of the existing approaches tend to produce predictions with poor boundaries. In this paper, we address the problem of preserving semantic segmentation boundaries in high resolution satellite imagery by introducing a new cascaded multi-task loss. We evaluate our approach on Inria Aerial Image Labeling Dataset which contains large-scale and high resolution images. Our results show that we are able to outperform state-of-the-art methods by 8.3\% without any additional post-processing step.

Via

Access Paper or Ask Questions