Modeling the temporal behavior of data is of central importance in many scientific and engineering fields. The baseline approach assumes that both the dynamic and the observation models are linear-Gaussian. Nonlinear extensions generally make exact inference intractable. Alternatively, one can consider several linear models, i.e., a piecewise linear model, and combine them with a switching mechanism; exact inference is then also intractable, because the number of Gaussian components grows exponentially over time. In this paper, we propose a variational approximation of piecewise linear dynamic systems. We provide full details of the derivation of a variational expectation-maximization algorithm that can be used either as a filter or as a smoother. We show that the model parameters can be split into two sets: a set of static (or observation) parameters and a set of dynamic parameters. The immediate consequences are that the former set can be estimated off-line and that the number of linear models (equivalently, the number of states of the switching variable) can be learned through model selection. We apply the proposed method to the problem of visual tracking, and we thoroughly compare our algorithm with several visual trackers on the problem of head-pose estimation.
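
To make the exponential growth of Gaussian components concrete, the following is a minimal sketch in generic notation, not the notation of this paper. The symbols are assumptions introduced for illustration only: $x_t$ is the latent state, $y_t$ the observation, $s_t \in \{1, \dots, K\}$ the switching variable, and $T$ the number of time steps. The sketch also assumes that the model is linear-Gaussian once the switching sequence is fixed.

```latex
% Exact filtering posterior of a generic switching linear-Gaussian model.
% Conditioned on a switching sequence s_{1:T}, the state posterior is Gaussian,
% so marginalizing over all K^T sequences yields a mixture of K^T Gaussians.
\begin{equation*}
  p(x_T \mid y_{1:T})
  = \sum_{s_1=1}^{K} \cdots \sum_{s_T=1}^{K}
    p(s_{1:T} \mid y_{1:T}) \,
    \underbrace{p(x_T \mid s_{1:T}, y_{1:T})}_{\text{Gaussian}}
\end{equation*}
```

For example, under these assumptions, $K = 5$ linear models and $T = 20$ time steps already give $5^{20} \approx 9.5 \times 10^{13}$ mixture components, which is why an approximation such as the variational one proposed here is needed.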