Background. Clinical parameters measured from gated single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI) have value in predicting cardiac resynchronization therapy (CRT) patient outcomes, but still show limitations. This study combines clinical variables, electrocardiogram (ECG) features, and cardiac function parameters with polarmaps from gated SPECT MPI through deep learning (DL) to predict CRT response. Methods. A total of 218 patients who underwent rest gated SPECT MPI were enrolled. CRT response was defined as an increase in left ventricular ejection fraction (LVEF) > 5% at 6-month follow-up. A DL model was constructed by combining a pre-trained VGG16 module and a multilayer perceptron. Two modalities of data were input to the model: polarmap images from SPECT MPI and tabular data from clinical features and ECG parameters. Gradient-weighted Class Activation Mapping (Grad-CAM) was applied to the VGG16 module to provide explainability for the polarmaps. For comparison, four machine learning (ML) models were trained using only the tabular features. Results. Of the 218 patients who underwent CRT implantation, 121 (55.5%) responded. The DL model achieved an average AUC of 0.83, accuracy of 0.73, sensitivity of 0.76, and specificity of 0.69, surpassing the ML models and guideline criteria. Guideline recommendations yielded an accuracy of 0.53, sensitivity of 0.75, and specificity of 0.26. Conclusions. The DL model outperformed the ML models, showcasing the additional predictive benefit of utilizing SPECT MPI polarmaps. Incorporating additional patient data directly in the form of medical imagery can improve CRT response prediction.
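
The response definition and the reported metrics can be illustrated with a minimal sketch. It is not the study's code. The function names are hypothetical, and the LVEF threshold is assumed here to be an absolute change in percentage points, which the abstract does not specify. The last helper uses the standard identity that accuracy equals the prevalence-weighted mean of sensitivity and specificity; for metrics averaged over several evaluation runs the identity holds only approximately.

```python
def is_responder(lvef_baseline, lvef_followup, threshold=5.0):
    """Label a patient as a CRT responder.

    Assumption: the '> 5%' criterion is an absolute LVEF increase in
    percentage points between baseline and 6-month follow-up.
    """
    return (lvef_followup - lvef_baseline) > threshold


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = responder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy implied by sensitivity, specificity, and responder prevalence."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Responder prevalence reported in the abstract: 121 of 218 patients.
prevalence = 121 / 218
guideline_accuracy = accuracy_from_rates(0.75, 0.26, prevalence)
dl_accuracy = accuracy_from_rates(0.76, 0.69, prevalence)
```

Under these assumptions, `guideline_accuracy` rounds to 0.53 and `dl_accuracy` rounds to 0.73, which matches the accuracies reported above. The guideline criteria reach a sensitivity close to the DL model's, so their lower accuracy comes almost entirely from their low specificity.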