Abstract: Single-image depth estimation is a challenging problem. The current state-of-the-art method formulates it as ordinal regression. However, that formulation is not fully differentiable, and depth maps are not generated in an end-to-end fashion. The method uses a naive threshold strategy to determine per-pixel depth labels, which results in significant discretization errors. For the first time, we formulate a fully differentiable ordinal regression and train the network in an end-to-end fashion. This enables us to include boundary and smoothness constraints in the optimization function, leading to smooth and edge-consistent depth maps. We also propose a novel per-pixel confidence map computation for depth refinement. Extensive evaluation of the proposed model on challenging benchmarks shows that it outperforms recent state-of-the-art methods, both quantitatively and qualitatively. Additionally, we demonstrate the practical utility of the proposed method in a single-camera bokeh solution, using an in-house dataset of challenging real-life images.
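
To illustrate why threshold-based label decoding introduces discretization error, the following is a minimal sketch and not the formulation proposed in this paper. It assumes a network that outputs, for each pixel, K ordinal probabilities (the probability that the depth exceeds the k-th threshold) and a set of depth bin edges; the function names `hard_decode` and `soft_decode`, the 0.5 cut-off, and the linear interpolation between bin centres are all assumptions made for illustration.

```python
import numpy as np

def hard_decode(probs, edges):
    """Threshold decoding (assumed form): count probabilities above 0.5,
    then return the centre of the bin with that integer label."""
    centres = 0.5 * (edges[:-1] + edges[1:])      # K + 1 bin centres
    labels = (probs > 0.5).sum(axis=-1)           # integer label in [0, K]
    return centres[labels]

def soft_decode(probs, edges):
    """Illustrative differentiable alternative (an assumption, not the paper's
    method): treat the sum of probabilities as a fractional label and
    interpolate linearly between neighbouring bin centres."""
    centres = 0.5 * (edges[:-1] + edges[1:])
    soft = probs.sum(axis=-1)                     # fractional label in [0, K]
    lo = np.clip(np.floor(soft).astype(int), 0, len(centres) - 2)
    frac = soft - lo
    return (1.0 - frac) * centres[lo] + frac * centres[lo + 1]

# Hypothetical setup: K = 4 thresholds, so K + 2 = 6 bin edges.
edges = np.linspace(0.0, 10.0, 6)
below = np.array([0.9, 0.9, 0.45, 0.1])   # third probability just under 0.5
above = np.array([0.9, 0.9, 0.55, 0.1])   # third probability just over 0.5
hard_jump = hard_decode(above, edges) - hard_decode(below, edges)
soft_jump = soft_decode(above, edges) - soft_decode(below, edges)
```

In this toy setup a small change in one probability moves the hard-decoded depth by a full bin width, whereas the soft-decoded depth changes only slightly. Because the soft variant is piecewise linear in the probabilities, gradients can flow through it, which is the kind of property that makes it possible to add smoothness or boundary terms to a training loss.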