Data-nulling superimposed pilot (DNSP) effectively alleviates the superimposed interference of superimposed training (ST)-based channel estimation (CE) in orthogonal frequency division multiplexing (OFDM) systems, while facing the challenges of the estimation accuracy and computational complexity. By developing the promising solutions of deep learning (DL) in the physical layer of wireless communication, we fuse the DNSP and DL to tackle these challenges in this paper. Nevertheless, due to the changes of wireless scenarios, the model mismatch of DL leads to the performance degradation of CE, and thus faces the issue of network retraining. To address this issue, a lightweight transfer learning (TL) network is further proposed for the DL-based DNSP scheme, and thus structures a TL-based CE in OFDM systems. Specifically, based on the linear receiver, the least squares estimation is first employed to extract the initial features of CE. With the extracted features, we develop a convolutional neural network (CNN) to fuse the solutions of DLbased CE and the CE of DNSP. Finally, a lightweight TL network is constructed to address the model mismatch. To this end, a novel CE network for the DNSP scheme in OFDM systems is structured, which improves its estimation accuracy and alleviates the model mismatch. The experimental results show that in all signal-to-noise-ratio (SNR) regions, the proposed method achieves lower normalized mean squared error (NMSE) than the existing DNSP schemes with minimum mean square error (MMSE)-based CE. For example, when the SNR is 0 decibel (dB), the proposed scheme achieves similar NMSE as that of the MMSE-based CE scheme at 20 dB, thereby significantly improving the estimation accuracy of CE. In addition, relative to the existing schemes, the improvement of the proposed scheme presents its robustness against the impacts of parameter variations.