In this paper, we present a data-driven approach for human pose tracking in video data. We formulate human pose tracking as a discrete optimization problem based on a spatio-temporal pictorial structure model and solve it efficiently in a greedy framework. The proposed model tracks human pose by combining single-image human pose estimation with traditional object tracking in video. Our pose tracking objective function consists of three terms: the appearance likelihood of a part within a frame, the temporal displacement of the part from the previous frame to the current frame, and the spatial dependency of a part on its parent in the graph structure. Experimental evaluation on benchmark datasets (VideoPose2, Poses in the Wild and Outdoor Pose) as well as on our newly built ICDPose dataset shows the usefulness of the proposed method.
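
To make the structure of the three-term objective concrete, the following is a minimal illustrative sketch of one greedy per-frame step, not the exact formulation used in this paper. It assumes that each part has a discrete set of candidate locations, that appearance is expressed as a cost (for example, a negative log-likelihood), and that the temporal and spatial terms are squared Euclidean distances. The function name `greedy_track_frame`, the weights `w_temporal` and `w_spatial`, and the per-part `rest_offset` from the parent are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def greedy_track_frame(
    candidates: Dict[str, List[Tuple[Point, float]]],  # part -> [(location, appearance cost)]
    prev_pose: Dict[str, Point],                        # part -> location in previous frame
    parent: Dict[str, Optional[str]],                   # part -> parent part (None for the root)
    rest_offset: Dict[str, Point],                      # assumed expected offset from parent
    order: List[str],                                   # parts listed with parents before children
    w_temporal: float = 1.0,                            # assumed weight of the temporal term
    w_spatial: float = 1.0,                             # assumed weight of the spatial term
) -> Dict[str, Point]:
    """Pick, part by part, the candidate with the lowest three-term cost."""
    pose: Dict[str, Point] = {}
    for part in order:
        par = parent[part]
        best_loc: Optional[Point] = None
        best_cost = float("inf")
        for loc, appearance_cost in candidates[part]:
            # Term 1: appearance of the part within the current frame.
            cost = appearance_cost
            # Term 2: temporal displacement from the previous frame.
            cost += w_temporal * sq_dist(loc, prev_pose[part])
            # Term 3: spatial dependency on the already-placed parent.
            if par is not None:
                expected = (
                    pose[par][0] + rest_offset[part][0],
                    pose[par][1] + rest_offset[part][1],
                )
                cost += w_spatial * sq_dist(loc, expected)
            if cost < best_cost:
                best_loc, best_cost = loc, cost
        pose[part] = best_loc
    return pose
```

Because each part is fixed before its children are considered, this sketch evaluates every candidate exactly once per frame. It trades the global optimality of exact inference over the whole sequence for speed, which is the general trade-off of a greedy scheme.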