Recently, deep learning-based compressive imaging (DCI) has surpassed the conventional compressive imaging in reconstruction quality and faster running time. While multi-scale has shown superior performance over single-scale, research in DCI has been limited to single-scale sampling. Despite training with single-scale images, DCI tends to favor low-frequency components similar to the conventional multi-scale sampling, especially at low subrate. From this perspective, it would be easier for the network to learn multi-scale features with a multi-scale sampling architecture. In this work, we proposed a multi-scale deep compressive imaging (MS-DCI) framework which jointly learns to decompose, sample, and reconstruct images at multi-scale. A three-phase end-to-end training scheme was introduced with an initial and two enhance reconstruction phases to demonstrate the efficiency of multi-scale sampling and further improve the reconstruction performance. We analyzed the decomposition methods (including Pyramid, Wavelet, and Scale-space), sampling matrices, and measurements and showed the empirical benefit of MS-DCI which consistently outperforms both conventional and deep learning-based approaches.