Abstract: In self-supervised representation learning, Siamese networks are a natural architecture for learning transformation invariance by bringing representations of positive pairs closer together. However, they are prone to collapsing into a degenerate solution. Contrastive learning addresses this issue with a contrastive loss that prevents collapse by pushing representations of negative pairs away from each other. However, algorithms that rely on negative sampling are known to be sensitive to a reduction in the number of negative samples. Other algorithms therefore avoid negative pairs altogether. Many of these positive-only algorithms adopt an asymmetric network architecture, consisting of source and target encoders, as a key factor in coping with collapse. By exploiting this asymmetric architecture, we introduce a methodology that implicitly incorporates the idea of contrastive learning. As its implementation, we present a novel method, guided stop-gradient. We apply our method to the benchmark algorithms SimSiam and BYOL and show that it stabilizes training and boosts performance. We also show that the algorithms with our method work well with small batch sizes and do not collapse even when there is no predictor. The code is available at https://github.com/bych-lee/gsg.
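The abstract leaves the underlying mechanism implicit, so here is a minimal PyTorch sketch of the stop-gradient asymmetry in the SimSiam baseline that guided stop-gradient builds on. This is only the standard baseline loss, not the guided variant itself (see the linked repository for that); `encoder` and `predictor` are assumed to be user-supplied modules.

```python
import torch.nn.functional as F

def neg_cosine(p, z):
    # Negative cosine similarity; detaching z implements the stop-gradient.
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

def simsiam_loss(encoder, predictor, x1, x2):
    # Two augmented views pass through one shared encoder; the predictor
    # makes the two branches asymmetric, and the detached branch plays
    # the role of the target encoder.
    z1, z2 = encoder(x1), encoder(x2)
    p1, p2 = predictor(z1), predictor(z2)
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)
```

The symmetrized loss applies the predictor to each branch in turn, so gradients flow through only one side of each term; this asymmetry is what the positive-only algorithms above rely on to avoid collapse.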
Abstract: To improve trust and transparency, it is crucial to be able to interpret the decisions of deep neural network classifiers (DNNs). Instance-level examinations, such as attribution techniques, are commonly employed to interpret model decisions. However, when interpreting misclassified decisions, human intervention may be required. Analyzing the attributions across each class within a single instance can be particularly labor-intensive and influenced by the bias of the human interpreter. In this paper, we present a novel framework to uncover the weaknesses of a classifier via counterfactual examples. A prober is introduced to learn the correctness of the classifier's decision in terms of a binary code: hit or miss. This enables the creation of counterfactual examples with respect to the prober's decision. We evaluate our prober's misclassification detection and verify its effectiveness on image classification benchmark datasets. Furthermore, by generating counterfactuals that penetrate the prober, we demonstrate that our framework effectively identifies vulnerabilities in the target classifier without relying on label information on the MNIST dataset.
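As a concrete illustration of the prober idea, the following sketch trains a small binary head on the target classifier's features to predict hit or miss. All names and shapes here (`Prober`, `prober_targets`, `feat_dim`) are hypothetical choices for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Prober(nn.Module):
    # Hypothetical prober: maps the target classifier's features to a
    # logit for "hit" (prediction correct) versus "miss" (misclassified).
    def __init__(self, feat_dim):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, feats):
        return self.head(feats).squeeze(-1)

def prober_targets(clf_logits, labels):
    # Binary supervision: 1.0 where the classifier was right, else 0.0.
    return (clf_logits.argmax(dim=-1) == labels).float()

# Usage sketch with placeholder tensors (batch of 8, 64-d features, 10 classes).
feats = torch.randn(8, 64)
clf_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
prober = Prober(feat_dim=64)
loss = F.binary_cross_entropy_with_logits(
    prober(feats), prober_targets(clf_logits, labels))
```

A counterfactual for an input would then be a perturbation that flips the prober's hit/miss prediction, which is how the framework probes the classifier's weaknesses without ground-truth labels.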
Abstract: A temporal point process is a stochastic process that predicts which type of event is likely to happen and when it will occur, given a history of a sequence of events. Occurrence dynamics of this kind arise throughout daily life, and it is important to model the temporal dynamics and solve two distinct prediction problems: time prediction and type prediction. In particular, deep neural network based models have outperformed statistical models such as Hawkes processes and Poisson processes. However, many existing approaches overfit to specific events instead of learning and predicting various event types, so they cannot cope with modified relationships between events and fail to predict the intensity functions of temporal point processes well. In this paper, we propose a variational neural temporal point process (VNTPP) to solve these problems. We introduce inference and generative networks, and train a distribution over a latent variable to capture the stochasticity of event sequences within a deep neural network. The intensity functions are computed using the latent variable distribution, so we can predict event types and the arrival times of events more accurately. We empirically demonstrate that our model can generalize the representations of various event types. Moreover, we show quantitatively and qualitatively that our model outperforms other deep neural network based models and statistical processes on synthetic and real-world datasets.
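To make the latent-variable intensity computation concrete, here is a hedged sketch of one plausible component: an inference head that parameterizes q(z | history) and maps a reparameterized sample of z to per-type intensities. The module name, shapes, and the Softplus activation are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LatentIntensity(nn.Module):
    # Hypothetical head: parameterizes q(z | history) with a Gaussian and
    # maps a sample of z to one nonnegative intensity per event type.
    def __init__(self, hidden_dim, latent_dim, num_types):
        super().__init__()
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)
        self.rates = nn.Sequential(
            nn.Linear(latent_dim, num_types), nn.Softplus())  # rates > 0

    def forward(self, history_embedding):
        mu = self.mu(history_embedding)
        logvar = self.logvar(history_embedding)
        # Reparameterization trick keeps the sampling step differentiable.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.rates(z), mu, logvar

# Example: intensities for 5 event types from a batch of history encodings.
head = LatentIntensity(hidden_dim=32, latent_dim=8, num_types=5)
lam, mu, logvar = head(torch.randn(4, 32))
```

In a VAE-style training loop, the returned mu and logvar would feed a KL term against the prior while the intensities drive the point-process likelihood over event types and arrival times.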