Abstract:We present a data-driven approach for forecasting global weather using graph neural networks. The system learns to step forward the current 3D atmospheric state by six hours, and multiple steps are chained together to produce skillful forecasts going out several days into the future. The underlying model is trained on reanalysis data from ERA5 or forecast data from GFS. Test performance on metrics such as Z500 (geopotential height) and T850 (temperature) improves upon previous data-driven approaches and is comparable to operational, full-resolution, physical models from GFS and ECMWF, at least when evaluated on 1-degree scales and when using reanalysis initial conditions. We also show results from connecting this data-driven model to live, operational forecasts from GFS.
Abstract:We present a system for performing visual search over billions of aerial and satellite images. The purpose of visual search is to find images that are visually similar to a query image. We define visual similarity using 512 abstract visual features generated by a convolutional neural network that has been trained on aerial and satellite imagery. The features are converted to binary values to reduce data and compute requirements. We employ a hash-based search using Bigtable, a scalable database service from Google Cloud. Searching the continental United States at 1-meter pixel resolution, corresponding to approximately 2 billion images, takes approximately 0.1 seconds. This system enables real-time visual search over the surface of the earth, and an interactive demo is available at https://search.descarteslabs.com.
Abstract:We present our experiences using cloud computing to support data-intensive analytics on satellite imagery for commercial applications. Drawing from our background in high-performance computing, we draw parallels between the early days of clustered computing systems and the current state of cloud computing and its potential to disrupt the HPC market. Using our own virtual file system layer on top of cloud remote object storage, we demonstrate aggregate read bandwidth of 230 gigabytes per second using 512 Google Compute Engine (GCE) nodes accessing a USA multi-region standard storage bucket. This figure is comparable to the best HPC storage systems in existence. We also present several of our application results, including the identification of field boundaries in Ukraine, and the generation of a global cloud-free base layer from Landsat imagery.