Abstract:We consider the problem of estimating the maximum posterior probability (MAP) state sequence for a finite state and finite emission alphabet hidden Markov model (HMM) in the Bayesian setup, where both emission and transition matrices have Dirichlet priors. We study a training set consisting of thousands of protein alignment pairs. The training data is used to set the prior hyperparameters for Bayesian MAP segmentation. Since the Viterbi algorithm is not applicable any more, there is no simple procedure to find the MAP path, and several iterative algorithms are considered and compared. The main goal of the paper is to test the Bayesian setup against the frequentist one, where the parameters of HMM are estimated using the training data.
Abstract:In a hidden Markov model, the underlying Markov chain is usually hidden. Often, the maximum likelihood alignment (Viterbi alignment) is used as its estimate. Although having the biggest likelihood, the Viterbi alignment can behave very untypically by passing states that are at most unexpected. To avoid such situations, the Viterbi alignment can be modified by forcing it not to pass these states. In this article, an iterative procedure for improving the Viterbi alignment is proposed and studied. The iterative approach is compared with a simple bunch approach where a number of states with low probability are all replaced at the same time. It can be seen that the iterative way of adjusting the Viterbi alignment is more efficient and it has several advantages over the bunch approach. The same iterative algorithm for improving the Viterbi alignment can be used in the case of peeping, that is when it is possible to reveal hidden states. In addition, lower bounds for classification probabilities of the Viterbi alignment under different conditions on the model parameters are studied.
Abstract:We consider the maximum likelihood (Viterbi) alignment of a hidden Markov model (HMM). In an HMM, the underlying Markov chain is usually hidden and the Viterbi alignment is often used as the estimate of it. This approach will be referred to as the Viterbi segmentation. The goodness of the Viterbi segmentation can be measured by several risks. In this paper, we prove the existence of asymptotic risks. Being independent of data, the asymptotic risks can be considered as the characteristics of the model that illustrate the long-run behavior of the Viterbi segmentation.