Abstract: In this paper, a family of neural network-based survival models is presented. The models are specified via piecewise definitions of the hazard function and the density function on a partition of the time axis; both piecewise-constant and piecewise-linear definitions are presented, resulting in a family of four models. The models can be seen as extensions of the commonly used discrete-time and piecewise-exponential models and thereby add flexibility to this set of standard models. On a simulated dataset, the models are shown to perform on par with a highly expressive, state-of-the-art energy-based model while requiring only a fraction of the computation time.
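To make the piecewise-constant variant concrete, the sketch below shows one way such a model could be set up and trained by maximum likelihood on right-censored data: a network outputs one non-negative hazard level per interval of a fixed time grid, and the loss is the negative log-likelihood built from the resulting cumulative hazard. This is a minimal illustration under assumptions made here (the PyTorch architecture, the fixed grid, and the names PiecewiseConstantHazard and neg_log_likelihood are all illustrative), not the paper's implementation.

# Minimal sketch (illustrative, not the paper's code): a neural network that
# outputs a piecewise-constant hazard on a fixed partition of the time axis,
# trained by maximizing the right-censored-data likelihood.
import torch
import torch.nn as nn

class PiecewiseConstantHazard(nn.Module):
    def __init__(self, n_features, grid):
        super().__init__()
        # grid: interval boundaries t_0 < t_1 < ... < t_K (K intervals)
        self.register_buffer("grid", grid)
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, len(grid) - 1),
        )

    def forward(self, x):
        # Softplus keeps the per-interval hazard levels non-negative.
        return nn.functional.softplus(self.net(x))        # shape (batch, K)

def neg_log_likelihood(model, x, t, event):
    """event = 1.0 for an observed failure at time t, 0.0 for right censoring."""
    hazards = model(x)                                     # (batch, K)
    widths = model.grid[1:] - model.grid[:-1]              # (K,)
    # Index of the interval containing t.
    k = torch.clamp(torch.bucketize(t, model.grid[1:-1]), max=hazards.shape[1] - 1)
    # Cumulative hazard H(t): full intervals before k plus the partial interval.
    full = torch.cumsum(hazards * widths, dim=1)
    H_before = torch.where(
        k > 0,
        full.gather(1, (k - 1).clamp(min=0).unsqueeze(1)).squeeze(1),
        torch.zeros_like(t),
    )
    h_k = hazards.gather(1, k.unsqueeze(1)).squeeze(1)
    H = H_before + h_k * (t - model.grid[k])
    # log f(t) = log h_k - H(t) for events; log S(t) = -H(t) for censored samples.
    return -(event * torch.log(h_k + 1e-8) - H).mean()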
Abstract: Accurate predictions of when a component will fail are crucial when planning maintenance, and by modeling the distribution of these failure times, survival models have proven particularly useful in this context. The presented methodology is based on conventional neural network-based survival models trained on data that is continuously gathered and stored at specific times, called snapshots. An important property of this type of training data is that it can contain more than one snapshot from a specific individual, which means that standard maximum likelihood training cannot be applied directly since the data points are not independent. However, the paper shows that if the data is in a specific format where all snapshot times are the same for all individuals, called homogeneously sampled, maximum likelihood training can be applied and produces desirable results. In many cases the data is not homogeneously sampled, and it is then proposed to resample the data to make it homogeneously sampled. How densely the dataset is resampled turns out to be an important parameter: it should be chosen high enough to produce good results, but this also increases the size of the dataset, which makes training slow. To reduce the number of samples needed during training, the paper also proposes to randomly resample the dataset at the start of each epoch instead of resampling it once before training starts. The proposed methodology is evaluated on both a simulated dataset and an experimental dataset of starter battery failures. The results show that if the data is homogeneously sampled, the methodology works as intended and produces accurate survival models. The results also show that randomly resampling the dataset at each epoch is an effective way to reduce the size of the training data.
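The epoch-wise resampling idea can be sketched roughly as follows: at the start of each epoch, one set of snapshot times shared by all individuals is drawn at random, and the epoch's training set is built from the stored snapshots closest to those times, so every epoch sees a homogeneously sampled dataset. The snippet below is only an illustration of this principle under assumptions made here (nearest-snapshot selection, uniformly drawn shared times, and the data layout expected by resample_epoch); the paper's exact scheme may differ.

# Illustrative sketch of per-epoch resampling to a homogeneously sampled dataset.
import numpy as np

def resample_epoch(individuals, n_times, t_max, rng):
    """individuals: list of dicts with 'times' (sorted snapshot times, np.array),
    'features' (one row per snapshot), 'failure_time' (failure or censoring time)
    and 'censored' (bool)."""
    shared_times = np.sort(rng.uniform(0.0, t_max, size=n_times))   # same grid for everyone
    X, time_to_event, event = [], [], []
    for ind in individuals:
        for s in shared_times:
            if s >= ind["failure_time"]:
                continue                                   # no data after failure/censoring
            j = np.argmin(np.abs(ind["times"] - s))        # nearest stored snapshot
            X.append(ind["features"][j])
            time_to_event.append(ind["failure_time"] - s)
            event.append(0 if ind["censored"] else 1)
    return np.asarray(X), np.asarray(time_to_event), np.asarray(event)

# Schematic training loop: rebuild the dataset at the start of each epoch.
# rng = np.random.default_rng(0)
# for epoch in range(n_epochs):
#     X, t, e = resample_epoch(individuals, n_times=5, t_max=100.0, rng=rng)
#     ...  # one pass of maximum likelihood training on (X, t, e)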