Recently, self-supervised learning (SSL) methods have been used in pre-training the segmentation models for 2D and 3D medical images. Most of these methods are based on reconstruction, contrastive learning and consistency regularization. However, the spatial correspondence of 2D slices from a 3D medical image has not been fully exploited. In this paper, we propose a novel self-supervised alignment learning framework to pre-train the neural network for medical image segmentation. The proposed framework consists of a new local alignment loss and a global positional loss. We observe that in the same 3D scan, two close 2D slices usually contain similar anatomic structures. Thus, the local alignment loss is proposed to make the pixel-level features of matched structures close to each other. Experimental results show that the proposed alignment learning is competitive with existing self-supervised pre-training approaches on CT and MRI datasets, under the setting of limited annotations.