Highway traffic modeling and forecasting are critical for intelligent transportation systems. Recently, deep-learning-based methods have emerged as the state of the art for a wide range of traffic forecasting tasks. However, these methods require a large amount of training data, which must be collected over a significant period of time. This makes it difficult to develop and deploy data-driven learning methods for highway networks that lack historical data. A promising approach to address this issue is transfer learning, where a model trained on one part of the highway network is adapted for a different part of the network. We focus on the diffusion convolutional recurrent neural network (DCRNN), a state-of-the-art graph neural network for highway network forecasting. It models the complex spatial and temporal dynamics of the highway network using a graph-based diffusion convolution operation within a recurrent neural network. DCRNN cannot perform transfer learning, however, because it learns location-specific traffic patterns, which cannot be used for unseen regions of the network. To that end, we develop TL-DCRNN, a new transfer learning approach for DCRNN, in which a single model trained on data-rich regions of the highway network can be used to forecast traffic on unseen regions of the network. We evaluate the ability of our approach to forecast traffic on the entire California highway network using one year of time series data. We show that TL-DCRNN can learn from several regions of the California highway network and forecast traffic on the unseen regions of the network with high accuracy. Moreover, we demonstrate that TL-DCRNN can learn from traffic data for the San Francisco region and forecast traffic in the Los Angeles region, and vice versa.
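For readers unfamiliar with the graph-based diffusion convolution mentioned above, the following is a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a weighted, directed adjacency matrix for the sensor graph and uses one scalar filter weight per diffusion step in each direction; the function names (`random_walk_matrix`, `diffusion_conv`) and parameter names (`theta_fwd`, `theta_bwd`) are illustrative. In an actual model these weights would be learned and the operation would sit inside the recurrent cell.

```python
import numpy as np


def random_walk_matrix(adj):
    """Row-normalize a weighted adjacency matrix into a transition matrix."""
    degree = adj.sum(axis=1, keepdims=True)
    # Avoid division by zero for nodes with no outgoing edges.
    safe_degree = np.where(degree > 0, degree, 1.0)
    return adj / safe_degree


def diffusion_conv(x, adj, theta_fwd, theta_bwd):
    """Bidirectional K-step diffusion convolution (illustrative sketch).

    x:         (num_nodes, num_features) signal on the graph
    adj:       (num_nodes, num_nodes) weighted, directed adjacency matrix
    theta_fwd: (K,) weights for diffusion along edge direction (assumed scalar)
    theta_bwd: (K,) weights for diffusion against edge direction (assumed scalar)
    """
    p_fwd = random_walk_matrix(adj)    # transitions along edges
    p_bwd = random_walk_matrix(adj.T)  # transitions along reversed edges
    out = np.zeros_like(x, dtype=float)
    x_fwd = x.astype(float)
    x_bwd = x.astype(float)
    for k in range(len(theta_fwd)):
        # Step k contributes the signal after k applications of each walk.
        out += theta_fwd[k] * x_fwd + theta_bwd[k] * x_bwd
        x_fwd = p_fwd @ x_fwd
        x_bwd = p_bwd @ x_bwd
    return out
```

Because the filter weights here depend only on the diffusion step and not on a particular node, the same weights can in principle be applied to a different graph, which is the kind of property a transfer learning approach for unseen regions would rely on.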