Efficient prediction of internet traffic is an essential part of Self Organizing Network (SON) for ensuring proactive management. There are many existing solutions for internet traffic prediction with higher accuracy using deep learning. But designing individual predictive models for each service provider in the network is challenging due to data heterogeneity, scarcity, and abnormality. Moreover, the performance of the deep sequence model in network traffic prediction with limited training data has not been studied extensively in the current works. In this paper, we investigated and evaluated the performance of the deep transfer learning technique in traffic prediction with inadequate historical data leveraging the knowledge of our pre-trained model. First, we used a comparatively larger real-world traffic dataset for source domain prediction based on five different deep sequence models: Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), LSTM Encoder-Decoder (LSTM_En_De), LSTM_En_De with Attention layer (LSTM_En_De_Atn), and Gated Recurrent Unit (GRU). Then, two best-performing models, LSTM_En_De and LSTM_En_De_Atn, from the source domain with an accuracy of 96.06% and 96.05% are considered for the target domain prediction. Finally, four smaller traffic datasets collected for four particular sources and destination pairs are used in the target domain to compare the performance of the standard learning and transfer learning in terms of accuracy and execution time. According to our experimental result, transfer learning helps to reduce the execution time for most cases, while the model's accuracy is improved in transfer learning with a larger training session.