Abstract: Existing Deep Learning (DL) frameworks typically do not provide ready-to-use solutions for robotics, where very specific learning, reasoning, and embodiment problems arise. Their relatively steep learning curve, the methodological differences between DL and traditional approaches, and the high complexity of DL models, which often necessitates specialized hardware accelerators, further increase the effort and cost of employing DL in robotics. Moreover, most existing DL methods follow a static inference paradigm inherited from traditional computer vision pipelines, ignoring active perception, in which the agent interacts with its environment in order to increase perception accuracy. In this paper, we present the Open Deep Learning Toolkit for Robotics (OpenDR). OpenDR aims to provide an open, non-proprietary, efficient, and modular toolkit that robotics companies and research institutions can easily use to develop and deploy AI and cognition technologies in robotics applications, providing a solid step towards addressing the aforementioned challenges. We also detail the design choices, along with an abstract interface created to overcome these challenges. This interface can describe a wide range of robotic tasks, spanning beyond the traditional DL cognition and inference offered by existing frameworks, and incorporates openness, homogeneity, and robotics-oriented perception (e.g., through active perception) as its core design principles.
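The abstract describes a unified, modular interface for robotic learning tasks. The sketch below illustrates what such a learner-style interface could look like; the class and method names are illustrative assumptions for this example, not the toolkit's actual API.

```python
# Illustrative sketch only: a unified, modular interface of the kind the
# abstract describes. Class and method names are assumptions, not OpenDR's API.
from abc import ABC, abstractmethod


class BaseLearner(ABC):
    """Common interface for training, inference, and deployment."""

    @abstractmethod
    def fit(self, dataset):
        """Train or fine-tune the model on a (robotics) dataset."""

    @abstractmethod
    def infer(self, observation):
        """Run inference on a single observation (e.g., an image)."""

    def optimize(self, target_device="embedded-gpu"):
        """Optionally optimize the model for a hardware accelerator."""
        raise NotImplementedError

    def act(self, observation, environment):
        """Hook for active perception: choose an action (e.g., a new
        viewpoint) to improve perception accuracy before inferring."""
        raise NotImplementedError
```

A concrete task (e.g., object detection or pose estimation) would subclass this base and reuse the same fit/infer/optimize lifecycle, which is the kind of homogeneity the abstract refers to.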
Abstract: Forecasting the formation and development of clouds is a central element of modern weather forecasting systems. Incorrect cloud forecasts can introduce major uncertainty into the overall accuracy of weather forecasts due to clouds' intrinsic role in the Earth's climate system. Few studies have tackled this challenging problem from a machine learning point of view, owing to a shortage of high-resolution datasets with many historical observations globally. In this paper, we present a novel satellite-based dataset called "CloudCast". It consists of 70,080 images with 10 different cloud types for multiple layers of the atmosphere, annotated at the pixel level. The spatial resolution of the dataset is 928 x 1530 pixels (3 x 3 km per pixel), with 15-minute intervals between frames for the period 2017-01-01 to 2018-12-31. All frames are centered on and projected over Europe. To supplement the dataset, we conduct an evaluation study with current state-of-the-art video prediction methods, including convolutional long short-term memory networks, generative adversarial networks, and optical flow-based extrapolation methods. As the evaluation of video prediction is difficult in practice, we aim for a thorough evaluation in both the spatial and temporal domains. Our benchmark models show promising results but leave ample room for improvement. To the best of the authors' knowledge, this is the first publicly available dataset with high-resolution cloud types at a high temporal granularity.
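As a quick consistency check of the stated dataset statistics, the two-year period at 15-minute intervals indeed yields the reported frame count. The snippet below is a minimal sketch using only the numbers given in the abstract; any file layout or loader details are deliberately omitted.

```python
# Sanity check of the dataset statistics stated in the abstract.
# Only the counts, resolution, and date range come from the text.
from datetime import date

days = (date(2018, 12, 31) - date(2017, 1, 1)).days + 1   # 730 days in 2017-2018
frames_per_day = 24 * 60 // 15                             # 15-min intervals -> 96 frames
total_frames = days * frames_per_day
assert total_frames == 70080                               # matches the abstract

height, width = 928, 1530                                  # pixels, ~3 x 3 km each
num_cloud_types = 10                                       # pixel-level class labels
print(f"{total_frames} frames of {height}x{width} px, {num_cloud_types} cloud types")
```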