This paper introduces the system submitted by dun_oscar team for the ICPR MSR Challenge. Three subsystems for task1-task3 are descripted respectively. In task1, we develop a visual system which includes a OCR model, a text tracker, and a NLP classifier for distinguishing subtitles and non-subtitles. In task2, we employ an ASR system which includes an AM with 18 layers and a 4-gram LM. Semi-supervised learning on unlabeled data is also vital. In task3, we employ the ASR system to improve the visual system, some false subtitles can be corrected by a fusion module.