Abstract:Chest X-ray (CXR) is the most typical medical image worldwide to examine various thoracic diseases. Automatically localizing lesions from CXR is a promising way to alleviate radiologists' daily reading burden. However, CXR datasets often have numerous image-level annotations and scarce lesion-level annotations, and more often, without annotations. Thus far, unifying different supervision granularities to develop thoracic disease detection algorithms has not been comprehensively addressed. In this paper, we present OXnet, the first deep omni-supervised thoracic disease detection network to our best knowledge that uses as much available supervision as possible for CXR diagnosis. Besides fully supervised learning, to enable learning from weakly-annotated data, we guide the information from a global classification branch to the lesion localization branch by a dual attention alignment module. To further enhance global information learning, we impose intra-class compactness and inter-class separability with a global prototype alignment module. For unsupervised data learning, we extend the focal loss to be its soft form to distill knowledge from a teacher model. Extensive experiments show the proposed OXnet outperforms competitive methods with significant margins. Further, we investigate omni-supervision under various annotation granularities and corroborate OXnet is a promising choice to mitigate the plight of annotation shortage for medical image diagnosis.