In this work, we propose a novel unsupervised deep learning model to address the multi-focus image fusion problem. First, we train an encoder-decoder network in an unsupervised manner to extract deep features from the input images. We then use these features together with spatial frequency to measure the activity level and derive a decision map. Finally, we apply consistency verification methods to refine the decision map and produce the fused result. The key idea behind the proposed method is that only objects within the depth of field (DOF) appear sharp in a photograph, while other objects are likely to be blurred. In contrast to previous works, our method analyzes sharpness in the deep features rather than in the original image. Experimental results demonstrate that the proposed method achieves state-of-the-art fusion performance compared with 16 existing fusion methods in both objective and subjective assessments.
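
To make the activity-level and decision-map steps concrete, the following is a minimal NumPy sketch and not the implementation evaluated in this work. It assumes the encoder has already produced feature tensors of shape (C, H, W) at the same spatial resolution as the source images. The function names, the window `radius`, and the box-sum neighborhood are illustrative assumptions, and the consistency verification step is omitted.

```python
import numpy as np


def spatial_frequency_map(features, radius=1):
    """Per-pixel spatial frequency of a (C, H, W) feature tensor.

    Illustrative assumption: squared first differences are summed over
    channels and over a (2*radius+1)^2 window, then square-rooted.
    """
    features = np.asarray(features, dtype=np.float64)
    # Horizontal and vertical first differences (zero at the first column/row).
    row_diff = np.zeros_like(features)
    col_diff = np.zeros_like(features)
    row_diff[:, :, 1:] = features[:, :, 1:] - features[:, :, :-1]
    col_diff[:, 1:, :] = features[:, 1:, :] - features[:, :-1, :]
    # Gradient energy aggregated across feature channels.
    energy = (row_diff ** 2 + col_diff ** 2).sum(axis=0)
    # Box-sum over the local window via shifted views of a zero-padded map.
    k = 2 * radius + 1
    h, w = energy.shape
    padded = np.pad(energy, radius, mode="constant")
    local = np.zeros_like(energy)
    for dy in range(k):
        for dx in range(k):
            local += padded[dy:dy + h, dx:dx + w]
    return np.sqrt(local)


def decision_map(feat_a, feat_b, radius=1):
    """Binary map: 1 where source A has the higher activity level, else 0."""
    sf_a = spatial_frequency_map(feat_a, radius)
    sf_b = spatial_frequency_map(feat_b, radius)
    return (sf_a >= sf_b).astype(np.uint8)


def fuse(img_a, img_b, decision):
    """Pick each pixel from A or B according to the decision map."""
    mask = decision.astype(bool)
    if img_a.ndim == 3:  # broadcast the mask over color channels (H, W, 3)
        mask = mask[..., None]
    return np.where(mask, img_a, img_b)
```

In this sketch, a region that is textured in the features of one source and flat in the other is assigned to the textured source, which mirrors the idea that in-focus regions carry more high-frequency content. A refinement step such as small-region removal would be applied to the output of `decision_map` before calling `fuse`.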