Face deepfake detection has seen impressive results recently. Nearly all existing deep learning techniques for face deepfake detection are fully supervised and require labels during training. In this paper, we design a novel deepfake detection method via unsupervised contrastive learning. We first generate two different transformed versions of an image and feed them into two sequential sub-networks, i.e., an encoder and a projection head. The unsupervised training is achieved by maximizing the correspondence degree of the outputs of the projection head. To evaluate the detection performance of our unsupervised method, we further use the unsupervised features to train an efficient linear classification network. Extensive experiments show that our unsupervised learning method enables comparable detection performance to state-of-the-art supervised techniques, in both the intra- and inter-dataset settings. We also conduct ablation studies for our method.