RGB-Thermal object tracking attempt to locate target object using complementary visual and thermal infrared data. Existing RGB-T trackers fuse different modalities by robust feature representation learning or adaptive modal weighting. However, how to integrate dual attention mechanism for visual tracking is still a subject that has not been studied yet. In this paper, we propose two visual attention mechanisms for robust RGB-T object tracking. Specifically, the local attention is implemented by exploiting the common visual attention of RGB and thermal data to train deep classifiers. We also introduce the global attention, which is a multi-modal target-driven attention estimation network. It can provide global proposals for the classifier together with local proposals extracted from previous tracking result. Extensive experiments on two RGB-T benchmark datasets validated the effectiveness of our proposed algorithm.