The accuracy and explainability of data-driven nowcasting models are of great importance in many socio-economic sectors reliant on weather-dependent decision making. This paper proposes a novel architecture called Small Attention Residual UNet (SAR-UNet) for precipitation and cloud cover nowcasting. Here, SmaAt-UNet is used as a core model and is further equipped with residual connections, parallel to the depthwise separable convolutions. The proposed SAR-UNet model is evaluated on two datasets, i.e., Dutch precipitation maps ranging from 2016 to 2019 and French cloud cover binary images from 2017 to 2018. The obtained results show that SAR-UNet outperforms other examined models in precipitation nowcasting from 30 to 180 minutes in the future as well as cloud cover nowcasting in the next 90 minutes. Furthermore, we provide additional insights on the nowcasts made by our proposed model using Grad-CAM, a visual explanation technique, which is employed on different levels of the encoder and decoder paths of the SAR-UNet model and produces heatmaps highlighting the critical regions in the input image as well as intermediate representations to the precipitation. The heatmaps generated by Grad-CAM reveal the interactions between the residual connections and the depthwise separable convolutions inside of the multiple depthwise separable blocks placed throughout the network architecture.