Abstract:An open research question in deep reinforcement learning is how to focus the policy learning of key decisions within a sparse domain. This paper emphasizes combining the advantages of inputoutput hidden Markov models and reinforcement learning towards interpretable maintenance decisions. We propose a novel hierarchical-modeling methodology that, at a high level, detects and interprets the root cause of a failure as well as the health degradation of the turbofan engine, while, at a low level, it provides the optimal replacement policy. It outperforms the baseline performance of deep reinforcement learning methods applied directly to the raw data or when using a hidden Markov model without such a specialized hierarchy. It also provides comparable performance to prior work, however, with the additional benefit of interpretability.
Abstract:We present an end-to-end framework for fast retraining of object detection models in human-robot-collaboration. Our Faster R-CNN based setup covers the whole workflow of automatic image generation and labeling, model retraining on-site as well as inference on a FPGA edge device. The intervention of a human operator reduces to providing the new object together with its label and starting the training process. Moreover, we present a new loss, the intraspread-objectosphere loss, to tackle the problem of open world recognition. Though it fails to completely solve the problem, it significantly reduces the number of false positive detections of unknown objects.
Abstract:This paper presents an online transfer learning framework for improving temperature predictions in residential buildings. In transfer learning, prediction models trained under a set of available data from a target domain (e.g., house with limited data) can be improved through the use of data generated from similar source domains (e.g., houses with rich data). Given also the need for prediction models that can be trained online (e.g., as part of a model-predictive-control implementation), this paper introduces the generalized online transfer learning algorithm (GOTL). It employs a weighted combination of the available predictors (i.e., the target and source predictors) and guarantees convergence to the best weighted predictor. Furthermore, the use of Transfer Component Analysis (TCA) allows for using more than a single source domains, since it may facilitate the fit of a single model on more than one source domains (houses). This allows GOTL to transfer knowledge from more than one source domains. We further validate our results through experiments in climate control for residential buildings and show that GOTL may lead to non-negligible energy savings for given comfort levels.