Visual tracking from an unmanned aerial vehicle (UAV) poses challenges such as occlusions or background clutter. In order to achieve more robust on-board UAV visual tracking, a pipeline combining information extracted from a visual tracker and a sparse 3D reconstruction of the static environment is introduced. The 3D reconstruction is based on an image-based structure-from-motion (SfM) component and thus allows to utilize a state estimator in a pseudo-3D space. Thereby improved handling of occlusion situations and background clutter is realized. Evaluation is done on prototypical image sequences captured from a UAV with low-altitude oblique views. The experimental results demonstrate the benefit of the proposed approach compared to only relying on visual cues or using a state estimation in the image space.