Abstract: The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to changes of state of the soccer ball, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey number of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2), (3), and (7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org. Baselines and development kits can be found at https://github.com/SoccerNet.
Abstract: The SoccerNet 2023 tracking challenge requires the detection and tracking of soccer players and the ball. In this work, we present our approach, which tackles these two tasks separately. For player tracking, we employ a state-of-the-art online multi-object tracker and a contemporary object detector. To overcome the limitations of our online approach, we incorporate a post-processing stage using interpolation and appearance-free track merging. Additionally, an appearance-based track merging technique is used to handle the termination and creation of tracks far from the image boundaries. Ball tracking is formulated as single object detection, and a fine-tuned YOLOv8l detector with proprietary filtering improves the detection precision. Our method achieves 3rd place in the SoccerNet 2023 tracking challenge with a HOTA score of 66.27.
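
As an illustration of the interpolation step in the post-processing stage, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the interpolation scheme, so linear interpolation of box coordinates is assumed here; the function name `interpolate_track`, the `(x, y, w, h)` box layout, and the `max_gap` threshold are all hypothetical choices made for the example.

```python
from typing import Dict, Tuple

# Assumed box layout: (x, y, w, h) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_track(track: Dict[int, Box], max_gap: int = 20) -> Dict[int, Box]:
    """Fill missing frames of one track by linear interpolation.

    `track` maps frame index -> box for the frames where the online
    tracker produced an output. Gaps longer than `max_gap` frames
    (a hypothetical threshold) are left unfilled.
    """
    frames = sorted(track)
    filled = dict(track)
    for prev, nxt in zip(frames, frames[1:]):
        gap = nxt - prev
        if gap <= 1 or gap > max_gap:
            continue  # nothing missing, or gap too long to trust
        for f in range(prev + 1, nxt):
            t = (f - prev) / gap  # fraction of the way from prev to nxt
            filled[f] = tuple(
                a + t * (b - a) for a, b in zip(track[prev], track[nxt])
            )
    return filled


# Usage: a track observed at frames 0 and 4 gets frames 1 to 3 filled in.
observed = {0: (0.0, 0.0, 10.0, 10.0), 4: (8.0, 4.0, 10.0, 10.0)}
completed = interpolate_track(observed)
```

Limiting the gap length is one plausible way to avoid bridging long occlusions where straight-line motion is unlikely to hold; the actual criterion used in the work is not stated in the abstract.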