Abstract:The text data present in overlaid bands convey brief descriptions of news events in broadcast videos. The process of text extraction becomes challenging as overlay text is presented in widely varying formats and often with animation effects. We note that existing edge density based methods are well suited for our application on account of their simplicity and speed of operation. However, these methods are sensitive to thresholds and have high false positive rates. In this paper, we present a contrast enhancement based preprocessing stage for overlay text detection and a parameter free edge density based scheme for efficient text band detection. The second contribution of this paper is a novel approach for multiple text region tracking with a formal identification of all possible detection failure cases. The tracking stage enables us to establish the temporal presence of text bands and their linking over time. The third contribution is the adoption of Tesseract OCR for the specific task of overlay text recognition using web news articles. The proposed approach is tested and found superior on news videos acquired from three Indian English television news channels along with benchmark datasets.
Abstract:Commercial detection in news broadcast videos involves judicious selection of meaningful audio-visual feature combinations and efficient classifiers. And, this problem becomes much simpler if these combinations can be learned from the data. To this end, we propose an Multiple Kernel Learning based method for boosting successful kernel functions while ignoring the irrelevant ones. We adopt a intermediate fusion approach where, a SVM is trained with a weighted linear combination of different kernel functions instead of single kernel function. Each kernel function is characterized by a feature set and kernel type. We identify the feature sub-space locations of the prediction success of a particular classifier trained only with particular kernel function. We propose to estimate a weighing function using support vector regression (with RBF kernel) for each kernel function which has high values (near 1.0) where the classifier learned on kernel function succeeded and lower values (nearly 0.0) otherwise. Second contribution of this work is TV News Commercials Dataset of 150 Hours of News videos. Classifier trained with our proposed scheme has outperformed the baseline methods on 6 of 8 benchmark dataset and our own TV commercials dataset.