We introduce a novel tracking-by-detection framework for tracking multiple objects in overhead-camera videos of airport security checkpoints, where the targets are passengers and their baggage items. Our approach improves object detection through a test-time data augmentation procedure that feeds multiple geometrically transformed copies of each image to a convolutional neural network. We cluster the resulting detections using the mean-shift algorithm, and a multiple hypothesis tracking algorithm then maintains the temporal identifiers of the targets based on the cluster centroids. Our method also incorporates a trajectory association mechanism that keeps these identifiers consistent as passengers move across camera views. Finally, we introduce a simple distance-based matching mechanism to associate passengers with their luggage. An evaluation of detection, tracking, and association performance on videos from multiple overhead cameras in a realistic airport checkpoint environment demonstrates the effectiveness of the proposed approach.
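
As a rough illustration of two of the steps described above, the following minimal sketch fuses detections by mean-shift clustering and then associates bags with passengers by distance. It is not the implementation evaluated in this work: the detector, the multiple hypothesis tracker, and the cross-camera association are omitted, the detections are assumed to have already been mapped back from the augmented images to the original image frame, and the function names (`mean_shift`, `match_bags`), the flat kernel, the bandwidth, and the distance threshold are all hypothetical choices made for the example.

```python
import math


def mean_shift(points, bandwidth, iters=50, tol=1e-3):
    """Flat-kernel mean shift (assumed kernel): shift each point to the mean
    of its neighbours within `bandwidth` until it stops moving."""
    modes = []
    for x, y in points:
        for _ in range(iters):
            near = [q for q in points if math.dist((x, y), q) <= bandwidth]
            if not near:
                break
            nx = sum(q[0] for q in near) / len(near)
            ny = sum(q[1] for q in near) / len(near)
            shift = math.dist((x, y), (nx, ny))
            x, y = nx, ny
            if shift < tol:
                break
        modes.append((x, y))
    # Merge converged modes that lie within one bandwidth of each other.
    centroids = []
    for m in modes:
        if not any(math.dist(m, c) <= bandwidth for c in centroids):
            centroids.append(m)
    return centroids


def match_bags(passengers, bags, max_dist):
    """Assign each bag to its nearest passenger, or None if nobody is
    within `max_dist` (hypothetical threshold, in pixels)."""
    owners = {}
    for bag_id, bag_xy in bags.items():
        best_id, best_d = min(
            ((pid, math.dist(bag_xy, pxy)) for pid, pxy in passengers.items()),
            key=lambda t: t[1],
            default=(None, math.inf),
        )
        owners[bag_id] = best_id if best_d <= max_dist else None
    return owners


# Detection centres (x, y) pooled over several transformed copies of one frame.
detections = [(100, 100), (102, 98), (99, 101), (300, 200), (301, 202)]
centroids = mean_shift(detections, bandwidth=20)

# Treat each centroid as one passenger and match two made-up bag positions.
passengers = {f"P{i + 1}": c for i, c in enumerate(centroids)}
bags = {"B1": (110, 105), "B2": (600, 600)}
print(match_bags(passengers, bags, max_dist=50))
```

In this sketch a bag with no passenger inside the threshold is left unassigned rather than forced onto the nearest person, which is one possible design choice for handling unattended items.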