Abstract: In this paper, we propose a robust visual tracking method that exploits the relationships between targets in adjacent frames using a patch-wise joint sparse representation. Two sets of overlapping patches of different sizes are extracted from the target candidates to construct two dictionaries under a joint sparse representation. Applying this representation within a structural sparse appearance model offers two advantages. First, the correlation of target patches over time is taken into account. Second, using a local appearance model with different patch sizes captures the local features of the target more thoroughly. Furthermore, the positions of the candidate patches and their occlusion levels are used together to obtain the final likelihood of each target candidate. Evaluations on a recent challenging benchmark show that our tracking method outperforms state-of-the-art trackers.
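
To make the two-dictionary construction concrete, the following is a minimal sketch of extracting overlapping patches at two sizes from a single candidate region and stacking them as dictionary columns. It is an illustration under stated assumptions, not the paper's implementation: the patch sizes (8 and 16 pixels), the 50% overlap, the L2 normalization of each column, and the function names `extract_patches` and `build_two_scale_dictionaries` are all hypothetical choices made for this example.

```python
import numpy as np


def extract_patches(region, patch_size, step):
    """Vectorize overlapping square patches of a 2-D region.

    Returns a matrix of shape (patch_size * patch_size, n_patches) whose
    columns are L2-normalized patches (normalization is an assumption).
    """
    h, w = region.shape
    cols = []
    for y in range(0, h - patch_size + 1, step):
        for x in range(0, w - patch_size + 1, step):
            patch = region[y:y + patch_size, x:x + patch_size]
            vec = patch.astype(np.float64).ravel()
            norm = np.linalg.norm(vec)
            # Leave all-zero patches untouched to avoid division by zero.
            cols.append(vec / norm if norm > 0 else vec)
    return np.stack(cols, axis=1)


def build_two_scale_dictionaries(region, sizes=(8, 16), overlap=0.5):
    """Build one patch dictionary per patch size (sizes are assumed values)."""
    dictionaries = {}
    for size in sizes:
        # Step between neighboring patches; 50% overlap gives step = size / 2.
        step = max(1, int(size * (1 - overlap)))
        dictionaries[size] = extract_patches(region, size, step)
    return dictionaries


# Hypothetical 32x32 grayscale candidate region filled with random values.
rng = np.random.default_rng(0)
candidate = rng.random((32, 32))
dicts = build_two_scale_dictionaries(candidate)
for size, D in dicts.items():
    print(size, D.shape)
```

With these assumed settings, the smaller patch size yields many low-dimensional columns and the larger one yields fewer high-dimensional columns, which is the sense in which the two dictionaries describe the same candidate at different local scales. The sparse coding step over these dictionaries is not shown here.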