Abstract:We present a numerical method for learning an accurate predictive model of an unknown stochastic dynamical system from its trajectory data. The method seeks to approximate the unknown flow map of the underlying system. It employs the idea of an autoencoder to identify the unobserved latent random variables. In our approach, we design an encoding function to discover the latent variables, which are modeled as unit Gaussian, and a decoding function to reconstruct the future states of the system. Both the encoder and decoder are expressed as deep neural networks (DNNs). Once the DNNs are trained on the trajectory data, the decoder serves as a predictive model for the unknown stochastic system. Through an extensive set of numerical examples, we demonstrate that the method is able to produce long-term system predictions using short bursts of trajectory data. It is also applicable to systems driven by non-Gaussian noise.
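As a rough illustration of how a decoder can act as a predictive model, the sketch below rolls a trajectory forward by sampling a unit Gaussian latent variable at each step and feeding it to the decoder together with the current state. It is not the authors' implementation: `decoder` is a hand-written stand-in (one Euler-Maruyama step of an Ornstein-Uhlenbeck process) used in place of a trained DNN, and the names `decoder`, `rollout`, and `dt` are assumptions made for this example.

```python
import random


def decoder(x, z, dt=0.01):
    """Stand-in for a trained decoder: (current state, latent z) -> next state.

    Assumption: one Euler-Maruyama step of dx = -x dt + 0.5 dW, chosen only
    so the sketch runs without a trained network.
    """
    return x - x * dt + 0.5 * (dt ** 0.5) * z


def rollout(x0, n_steps, rng):
    """Generate one predicted trajectory by repeatedly applying the decoder."""
    traj = [x0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # latent variable sampled as unit Gaussian
        traj.append(decoder(traj[-1], z))
    return traj


rng = random.Random(0)  # fixed seed so the sketch is reproducible
trajectory = rollout(x0=1.0, n_steps=100, rng=rng)
```

Repeating `rollout` with different seeds gives an ensemble of trajectories from which statistics of the predicted system can be estimated.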
Abstract:Process mining is a relatively new subject which builds a bridge between traditional process modelling and data mining. Process discovery is one of the most critical parts of process mining, and aims at discovering process models automatically from event logs. The performance of existing process discovery algorithms can be affected when activity labels are missing from event logs. Several methods have been proposed to repair missing activity labels, but their accuracy can drop when a large number of activity labels are missing. In this paper, we propose an LSTM-based prediction model to predict the missing activity labels in event logs. The proposed model takes both the prefix and suffix sequences of the events with missing activity labels as input. Additional attributes of event logs are also utilised to improve the performance. Our evaluation on several publicly available datasets shows that the proposed method consistently outperforms existing methods for repairing missing activity labels in event logs.
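The prefix and suffix inputs mentioned above can be sketched as follows. This only shows how the two sequences might be extracted for each event with a missing label; it does not include the LSTM itself. The marker `MISSING`, the placeholder token `"<UNK>"`, and the function name `prefix_suffix_inputs` are assumptions made for this example.

```python
MISSING = None  # hypothetical marker for an event whose activity label is missing


def prefix_suffix_inputs(trace, placeholder="<UNK>"):
    """Return (index, prefix, suffix) for every event with a missing label.

    Other missing labels inside a prefix or suffix are replaced by a
    placeholder token (an assumption of this sketch).
    """
    filled = [placeholder if a is MISSING else a for a in trace]
    return [
        (i, filled[:i], filled[i + 1:])
        for i, a in enumerate(trace)
        if a is MISSING
    ]


trace = ["register", None, "check", "decide"]
inputs = prefix_suffix_inputs(trace)
```

Here `inputs` holds one entry for the second event, with the prefix `["register"]` and the suffix `["check", "decide"]`.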
Abstract:Process mining is a relatively new subject which builds a bridge between process modelling and data mining. An exclusive choice in a process model usually splits the process into different branches. However, in some processes, it is possible to switch from one branch to another. The inductive miner is guaranteed to return sound process models, but, due to the limitations of process trees, it fails to return a precise model when there are switch behaviours between different exclusive choice branches. In this paper, we present a novel extension to the process tree model to support switch behaviours between different branches of the exclusive choice operator and propose a novel extension to the inductive miner to discover sound process models with switch behaviours. The proposed discovery technique utilizes the theory of a previous study to detect possible switch behaviours. We use both artificial and publicly available datasets to evaluate our approach. Our results show that our approach can improve the precision of discovered models by 36% while maintaining high fitness values compared to the original inductive miner.
Abstract:The insights revealed by process mining rely heavily on the quality of event logs. Activities extracted from different data sources, or entered as free text within the same system, may have inconsistent labels. Such inconsistency leads to redundant activity labels, that is, labels that have different syntax but share the same behaviour. Identifying these labels through data-driven process discovery is difficult and relies heavily on human intervention. In this paper, we propose an approach to detect redundant activity labels using control-flow relations and data values from event logs. We have evaluated our approach using two publicly available logs and a case study using the MIMIC-III data set. The results demonstrate that our approach can detect redundant activity labels even with low occurrence frequencies. This approach can add value to the preprocessing step by generating more representative event logs for process mining tasks.
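One simple way to compare labels by control-flow relations is sketched below. The abstract does not specify which relations or similarity measure are used, so everything here is an assumption for illustration: each label is described by its directly-follows context (the labels seen immediately before and after it), contexts are compared with Jaccard similarity, and pairs above a hypothetical `threshold` are reported as candidates. Data values are not considered in this sketch.

```python
from collections import defaultdict
from itertools import combinations


def contexts(log):
    """Map each label to its set of (direction, neighbour) directly-follows contexts."""
    ctx = defaultdict(set)
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            ctx[a].add(("next", b))
            ctx[b].add(("prev", a))
    return ctx


def jaccard(s, t):
    """Jaccard similarity of two sets (0.0 if both are empty)."""
    return len(s & t) / len(s | t) if s | t else 0.0


def candidate_redundant_pairs(log, threshold=0.8):
    """Label pairs whose directly-follows contexts are at least `threshold` similar."""
    ctx = contexts(log)
    return sorted(
        (a, b)
        for a, b in combinations(sorted(ctx), 2)
        if jaccard(ctx[a], ctx[b]) >= threshold
    )


# Toy log (invented for this example): two spellings of the same activity.
log = [
    ["admit", "blood test", "discharge"],
    ["admit", "Blood Test", "discharge"],
    ["admit", "x-ray", "review", "discharge"],
]
pairs = candidate_redundant_pairs(log)
```

Because the comparison uses sets of neighbours rather than counts, a label that occurs only once can still be matched, as the toy log shows.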
Abstract:Process mining acts as a valuable tool to analyse the behaviour of an organisation by offering techniques to discover, monitor and enhance real processes. The key to process mining is to discover understandable process models. However, real-life logs can be complex and contain redundant activities, which share similar behaviour but have different syntax. We show that the existence of such redundant activities heavily affects the quality of discovered process models. Existing approaches filter activities by frequency, which cannot solve the problems caused by redundant activities. In this paper, we propose first to discover redundant activities at the log level and then to use the discovery results to simplify event logs. Two publicly available data sets are used to evaluate the usability of our approach in real-life processes. Our approach can be adopted as a preprocessing step before applying any discovery algorithm to produce simplified models.
Abstract:Business processes continuously evolve in order to adapt to changes caused by various factors. One type of process change is a branching frequency change, that is, a change in how frequently each option is taken at an exclusive choice. Existing methods either cannot detect such changes or cannot provide accurate and comprehensive results. In this paper, we propose a method which takes both event logs and process models as input and generates a choice sequence for each exclusive choice in the process model. The method then identifies change points based on the choice sequences. We evaluate our method on a real-life event log. Results show that our method can identify branching frequency changes in process models and provide comprehensive results to users.
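The idea of a choice sequence can be illustrated with a small sketch. It is not the method from the paper: the process model is replaced by a hypothetical dictionary `branch_of` that maps the first activity of each branch to a branch name, and the change-point test (comparing branch frequencies in fixed-size windows before and after each position) is a naive assumption chosen for brevity.

```python
def choice_sequence(log, branch_of):
    """Branch taken at the exclusive choice, one entry per trace that reaches it."""
    seq = []
    for trace in log:
        taken = next((branch_of[a] for a in trace if a in branch_of), None)
        if taken is not None:
            seq.append(taken)
    return seq


def change_points(seq, branch, window, threshold):
    """Positions where the frequency of `branch` differs by at least `threshold`
    between the `window` choices before and the `window` choices after."""
    points = []
    for i in range(window, len(seq) - window + 1):
        before = seq[i - window:i].count(branch) / window
        after = seq[i:i + window].count(branch) / window
        if abs(after - before) >= threshold:
            points.append(i)
    return points


# Toy log (invented): the first four cases are approved, the last four rejected.
log = [["submit", "approve", "archive"]] * 4 + [["submit", "reject", "archive"]] * 4
branch_of = {"approve": "approve", "reject": "reject"}  # hypothetical model stand-in

seq = choice_sequence(log, branch_of)
points = change_points(seq, branch="approve", window=3, threshold=0.9)
```

In this toy log the only reported position is index 4, where the choices switch from all "approve" to all "reject".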