Abstract: This work addresses camera selection, the task of predicting which of multiple candidate cameras should be "on air" during a soccer broadcast. The task is challenging because training data that include all candidate views are scarce. Broadcast videos are freely available on the Internet (e.g., YouTube), but they record only the selected camera view and omit the other candidate views. To overcome this problem, we first introduce a random survival forest (RSF) method to impute the incomplete data effectively. We then propose a spatial-appearance heatmap to describe foreground objects (e.g., players and the ball) in an image. To evaluate our system, we collect the largest dataset to date for soccer broadcast camera selection. It comprises one main game with all candidate views and twelve auxiliary games with only the broadcast view. Our method significantly outperforms state-of-the-art methods on this challenging dataset. Further analysis suggests that the performance improvement indeed comes from the additional information provided by the auxiliary games.