We propose a semi-supervised learning framework for monocular depth estimation. Compared to existing semi-supervised learning methods, which inherit limitations of both sparse supervised and unsupervised loss functions, we achieve the complementary advantages of both loss functions, by building two separate network branches for each loss and distilling each other through the mutual distillation loss function. We also present to apply different data augmentation to each branch, which improves the robustness. We conduct experiments to demonstrate the effectiveness of our framework over the latest methods and provide extensive ablation studies.