Background: Cardiac resynchronization therapy (CRT) has emerged as an effective treatment for heart failure patients with electrical dyssynchrony. However, accurately predicting which patients will respond to CRT remains a challenge. This study explores the use of deep transfer learning to train a predictive model for CRT response. Methods: The short-time Fourier transform (STFT) was used to convert ECG signals into two-dimensional images. A convolutional neural network (CNN) was pre-trained on the MIT-BIH ECG database, fine-tuned to extract relevant features from the ECG images, and then tested on our dataset of CRT patients to predict their response. Results: Seventy-one CRT patients were enrolled. The transfer learning model achieved an accuracy of 72% in distinguishing responders from non-responders in the local dataset, with a sensitivity of 0.78 and a specificity of 0.79 for identifying CRT responders. The model outperformed clinical guidelines and traditional machine learning approaches. Conclusion: Using ECG images as input and applying transfer learning improves accuracy in identifying CRT responders. This approach has the potential to enhance patient selection and improve CRT outcomes.
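As an illustration of the STFT step described in the Methods, the following is a minimal sketch of how a one-dimensional ECG trace could be converted into a two-dimensional time-frequency image. It is not the study's implementation: the function name `ecg_to_spectrogram`, the window length, the overlap, the log scaling, the 360 Hz sampling rate, and the synthetic test signal are all assumptions chosen for illustration.

```python
import numpy as np


def ecg_to_spectrogram(signal, fs=360, nperseg=128, noverlap=64):
    """Hypothetical helper: log-magnitude STFT of a 1-D signal, scaled to [0, 1].

    fs, nperseg and noverlap are illustrative defaults, not values from the study.
    Returns (freqs, times, image) with image shaped (n_freqs, n_frames).
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < nperseg:
        raise ValueError("signal is shorter than one analysis window")

    hop = nperseg - noverlap
    window = np.hanning(nperseg)  # Hann window tapers each segment
    n_frames = 1 + (signal.size - nperseg) // hop

    # Slice the signal into overlapping windowed segments.
    frames = np.stack(
        [signal[i * hop : i * hop + nperseg] * window for i in range(n_frames)]
    )

    # One-sided FFT of each segment gives the short-time spectrum.
    spectrum = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectrum).T  # rows: frequency, columns: time

    # Log compression and min-max scaling so the result can be used as an image.
    log_mag = np.log1p(magnitude)
    image = (log_mag - log_mag.min()) / (log_mag.max() - log_mag.min() + 1e-12)

    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + nperseg / 2) / fs
    return freqs, times, image


# Synthetic stand-in for an ECG trace (assumption: 10 s sampled at 360 Hz).
fs = 360
t = np.arange(10 * fs) / fs
synthetic = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 25 * t)

freqs, times, image = ecg_to_spectrogram(synthetic, fs=fs)
print(image.shape)
```

The resulting array can be resized or stacked into channels to match the input shape a CNN expects. In practice the window length and overlap trade time resolution against frequency resolution and would need to be tuned to the ECG recordings at hand.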