Abstract:As the Computer Vision community rapidly develops and advances algorithms for autonomous driving systems, the goal of safer and more efficient autonomous transportation is becoming increasingly achievable. However, it is 2024, and we still do not have fully self-driving cars. One of the remaining core challenges lies in addressing the novelty problem, where self-driving systems still struggle to handle previously unseen situations on the open road. With our Challenge of Out-Of-Label (COOOL) benchmark, we introduce a novel dataset for hazard detection, offering versatile evaluation metrics applicable across various tasks, including novelty-adjacent domains such as Anomaly Detection, Open-Set Recognition, Open Vocabulary, and Domain Adaptation. COOOL comprises over 200 collections of dashcam-oriented videos, annotated by human labelers to identify objects of interest and potential driving hazards. It includes a diverse range of hazards and nuisance objects. Due to the dataset's size and data complexity, COOOL serves exclusively as an evaluation benchmark.
Abstract:Tracking the trajectory of tennis players can help camera operators in production. Predicting future movement enables cameras to automatically track and predict a player's future trajectory without human intervention. Predicting future human movement in the context of complex physical tasks is also intellectually satisfying. Swift advancements in sports analytics and the wide availability of videos for tennis have inspired us to propose a novel method called Pose2Trajectory, which predicts a tennis player's future trajectory as a sequence derived from their body joints' data and ball position. Demonstrating impressive accuracy, our approach capitalizes on body joint information to provide a comprehensive understanding of the human body's geometry and motion, thereby enhancing the prediction of the player's trajectory. We use encoder-decoder Transformer architecture trained on the joints and trajectory information of the players with ball positions. The predicted sequence can provide information to help close-up cameras to keep tracking the tennis player, following centroid coordinates. We generate a high-quality dataset from multiple videos to assist tennis player movement prediction using object detection and human pose estimation methods. It contains bounding boxes and joint information for tennis players and ball positions in singles tennis games. Our method shows promising results in predicting the tennis player's movement trajectory with different sequence prediction lengths using the joints and trajectory information with the ball position.