Computed tomography (CT) of the temporal bone has become an important method for diagnosing ear diseases. Because subject posture and CT scanner settings vary, CT images of the human temporal bone should be geometrically calibrated to ensure the symmetry of the bilateral anatomical structures. Manual calibration is time-consuming for radiologists, yet it is an important pre-processing step for further computer-aided CT analysis. We propose an automatic calibration algorithm for temporal bone CT images. The algorithm first segments the lateral semicircular canals (LSCs) as anchors and then defines a standard 3D coordinate system. The key step is the LSC segmentation, for which we design a novel 3D encoder-decoder network that introduces 3D dilated convolution and a multi-pooling scheme for feature fusion in the encoding stage. Experimental results show that the proposed LSC segmentation network achieves higher segmentation accuracy. The proposed method can help to calibrate temporal bone CT images efficiently.
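To make the 3D dilated convolution mentioned above concrete, the following is a minimal sketch of the generic operation in plain Python. It is not the network described here: the abstract does not give kernel sizes, dilation rates, or layer layout, so the function names `effective_kernel_size` and `dilated_conv3d`, the 3x3x3 kernel, and the dilation rate of 2 are illustrative assumptions only.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Extent covered along one axis by a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv3d(volume, kernel, dilation=1):
    """Valid-mode 3D dilated cross-correlation on nested lists.

    volume: nested list indexed as [z][y][x]
    kernel: nested list indexed as [i][j][k]
    dilation: spacing between kernel taps (1 means an ordinary convolution)
    Hypothetical helper for illustration; no padding, stride 1, one channel.
    """
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])

    # Output shrinks by the effective kernel extent minus one on each axis.
    out_d = depth - dilation * (kd - 1)
    out_h = height - dilation * (kh - 1)
    out_w = width - dilation * (kw - 1)
    if min(out_d, out_h, out_w) <= 0:
        raise ValueError("volume is smaller than the dilated kernel")

    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(out_d)]
    for z in range(out_d):
        for y in range(out_h):
            for x in range(out_w):
                acc = 0.0
                for i in range(kd):
                    for j in range(kh):
                        for k in range(kw):
                            # Taps are spaced `dilation` voxels apart.
                            acc += (kernel[i][j][k]
                                    * volume[z + i * dilation]
                                            [y + j * dilation]
                                            [x + k * dilation])
                out[z][y][x] = acc
    return out


# Example: a 3x3x3 kernel with dilation 2 spans 5 voxels per axis,
# so a 5x5x5 volume yields a single output voxel.
ones_volume = [[[1.0] * 5 for _ in range(5)] for _ in range(5)]
ones_kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
result = dilated_conv3d(ones_volume, ones_kernel, dilation=2)
```

With dilation, a 3x3x3 kernel keeps its 27 weights but covers a 5x5x5 neighborhood, which is the usual motivation for dilated convolution in an encoder: a larger receptive field without extra parameters or additional downsampling.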