Deep learning has attracted great attention recently and yielded the state of the art performance in dimension reduction and classification problems. However, it cannot effectively handle the structured output prediction, e.g. sequential labeling. In this paper, we propose a deep learning structure, which can learn discriminative features for sequential labeling problems. More specifically, we add the inter-relationship between labels in our deep learning structure, in order to incorporate the context information from the sequential data. Thus, our model is more powerful than linear Conditional Random Fields (CRFs) because the objective function learns latent non-linear features so that target labeling can be better predicted. We pretrain the deep structure with stacked restricted Boltzmann machines (RBMs) for feature learning and optimize our objective function with online learning algorithm, a mixture of perceptron training and stochastic gradient descent. We test our model on different challenge tasks, and show that our model outperforms significantly over the completive baselines.