Abstract:Recently deep learning has been successfully applied to unsupervised active learning. However, current method attempts to learn a nonlinear transformation via an auto-encoder while ignoring the sample relation, leaving huge room to design more effective representation learning mechanisms for unsupervised active learning. In this paper, we propose a novel deep unsupervised Active Learning model via Learnable Graphs, named ALLG. ALLG benefits from learning optimal graph structures to acquire better sample representation and select representative samples. To make the learnt graph structure more stable and effective, we take into account $k$-nearest neighbor graph as a priori, and learn a relation propagation graph structure. We also incorporate shortcut connections among different layers, which can alleviate the well-known over-smoothing problem to some extent. To the best of our knowledge, this is the first attempt to leverage graph structure learning for unsupervised active learning. Extensive experiments performed on six datasets demonstrate the efficacy of our method.
Abstract:Weakly supervised action localization is a challenging task with extensive applications, which aims to identify actions and the corresponding temporal intervals with only video-level annotations available. This paper analyzes the order-sensitive and location-insensitive properties of actions, and embodies them into a self-augmented learning framework to improve the weakly supervised action localization performance. To be specific, we propose a novel two-branch network architecture with intra/inter-action shuffling, referred to as ActShufNet. The intra-action shuffling branch lays out a self-supervised order prediction task to augment the video representation with inner-video relevance, whereas the inter-action shuffling branch imposes a reorganizing strategy on the existing action contents to augment the training set without resorting to any external resources. Furthermore, the global-local adversarial training is presented to enhance the model's robustness to irrelevant noises. Extensive experiments are conducted on three benchmark datasets, and the results clearly demonstrate the efficacy of the proposed method.