Wageningen University and Research, the Netherlands
Abstract:Crop breeding is crucial in improving agricultural productivity while potentially decreasing land usage, greenhouse gas emissions, and water consumption. However, breeding programs are challenging due to long turnover times, high-dimensional decision spaces, long-term objectives, and the need to adapt to rapid climate change. This paper introduces the use of Reinforcement Learning (RL) to optimize simulated crop breeding programs. RL agents are trained to make optimal crop selection and cross-breeding decisions based on genetic information. To benchmark RL-based breeding algorithms, we introduce a suite of Gym environments. The study demonstrates the superiority of RL techniques over standard practices in terms of genetic gain when simulated in silico using real-world genomic maize data.
Abstract:Crop yield prediction typically involves the utilization of either theory-driven process-based crop growth models, which have proven to be difficult to calibrate for local conditions, or data-driven machine learning methods, which are known to require large datasets. In this work we investigate potato yield prediction using a hybrid meta-modeling approach. A crop growth model is employed to generate synthetic data for (pre)training a convolutional neural net, which is then fine-tuned with observational data. When applied in silico, our meta-modeling approach yields better predictions than a baseline comprising a purely data-driven approach. When tested on real-world data from field trials (n=303) and commercial fields (n=77), the meta-modeling approach yields competitive results with respect to the crop growth model. In the latter set, however, both models perform worse than a simple linear regression with a hand-picked feature set and dedicated preprocessing designed by domain experts. Our findings indicate the potential of meta-modeling for accurate crop yield prediction; however, further advancements and validation using extensive real-world datasets is recommended to solidify its practical effectiveness.
Abstract:Learning latent representations has aided operational decision-making in several disciplines. Its advantages include uncovering hidden interactions in data and automating procedures which were performed manually in the past. Representation learning is also being adopted by earth and environmental sciences. However, there are still subfields that depend on manual feature engineering based on expert knowledge and the use of algorithms which do not utilize the latent space. Relying on those techniques can inhibit operational decision-making since they impose data constraints and inhibit automation. In this work, we adopt a case study for nitrogen response rate prediction and examine if representation learning can be used for operational use. We compare a Multilayer Perceptron, an Autoencoder, and a dual-head Autoencoder with a reference Random Forest model for nitrogen response rate prediction. To bring the predictions closer to an operational setting we assume absence of future weather data, and we are evaluating the models using error metrics and a domain-derived error threshold. The results show that learning latent representations can provide operational nitrogen response rate predictions by offering performance equal and sometimes better than the reference model.
Abstract:Predictor inputs and label data for crop yield forecasting are not always available at the same spatial resolution. We propose a deep learning framework that uses high resolution inputs and low resolution labels to produce crop yield forecasts for both spatial levels. The forecasting model is calibrated by weak supervision from low resolution crop area and yield statistics. We evaluated the framework by disaggregating regional yields in Europe from parent statistical regions to sub-regions for five countries (Germany, Spain, France, Hungary, Italy) and two crops (soft wheat and potatoes). Performance of weakly supervised models was compared with linear trend models and Gradient-Boosted Decision Trees (GBDT). Higher resolution crop yield forecasts are useful to policymakers and other stakeholders. Weakly supervised deep learning methods provide a way to produce such forecasts even in the absence of high resolution yield data.
Abstract:Nitrogen fertilizers have a detrimental effect on the environment, which can be reduced by optimizing fertilizer management strategies. We implement an OpenAI Gym environment where a reinforcement learning agent can learn fertilization management policies using process-based crop growth models and identify policies with reduced environmental impact. In our environment, an agent trained with the Proximal Policy Optimization algorithm is more successful at reducing environmental impacts than the other baseline agents we present.