Realizing the benefits of massive multiple-input multiple-output (MIMO) systems requires accurate channel state information (CSI), which is obtained through channel estimation. However, because of the complicated wireless propagation environment and the large-scale antenna arrays, precise channel estimation for massive MIMO systems is challenging and incurs a large training overhead. Acquiring sufficiently accurate CSI consumes considerable time-frequency resources, which severely degrades the spectral and energy efficiency of the system. In this paper, we propose a dual-attention-based channel estimation network (DACEN) that achieves accurate channel estimation from low-density pilots by decoupling the spatial and temporal features of massive MIMO channels with a temporal attention module and a spatial attention module. To further improve the estimation accuracy, we propose a parameter-instance transfer learning approach based on the DACEN that transfers the channel knowledge learned from high-density pilots pre-acquired during the training dataset collection period. Experimental results on a publicly available dataset show that the proposed DACEN-based method with low-density pilots ($\rho_L=6/52$) achieves better channel estimation performance than existing methods, even when those methods use higher-density pilots ($\rho_H=26/52$). Moreover, with the proposed transfer learning approach, the DACEN-based method with ultra-low-density pilots ($\rho_L^\prime=2/52$) achieves higher estimation accuracy than existing methods with low-density pilots, demonstrating the effectiveness and superiority of the proposed method.
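
To make the quoted pilot densities concrete, the following minimal sketch computes the relative pilot overhead saved when moving between the three densities. It assumes that the denominator 52 counts the candidate pilot positions per symbol, which the abstract does not state explicitly. The helper `comb_pilot_indices` is a hypothetical, evenly spaced placement used only for illustration and is not necessarily the pilot pattern used with the DACEN.

```python
from fractions import Fraction

# Assumption: 52 candidate pilot positions per symbol (the shared denominator).
NUM_POSITIONS = 52

# Pilot densities quoted in the abstract.
RHO_H = Fraction(26, NUM_POSITIONS)        # higher-density pilots
RHO_L = Fraction(6, NUM_POSITIONS)         # low-density pilots
RHO_L_PRIME = Fraction(2, NUM_POSITIONS)   # ultra-low-density pilots


def overhead_saving(rho_ref, rho_new):
    """Fraction of pilot positions saved when moving from rho_ref to rho_new."""
    return 1 - rho_new / rho_ref


def comb_pilot_indices(num_positions, num_pilots):
    """Hypothetical evenly spaced (comb-type) pilot placement.

    Illustration only: the actual pilot pattern is not given in the abstract.
    """
    return [(k * num_positions) // num_pilots for k in range(num_pilots)]


if __name__ == "__main__":
    # Saving of low-density pilots relative to higher-density pilots.
    print(f"rho_H -> rho_L : {float(overhead_saving(RHO_H, RHO_L)):.1%}")
    # Saving of ultra-low-density pilots relative to low-density pilots.
    print(f"rho_L -> rho_L': {float(overhead_saving(RHO_L, RHO_L_PRIME)):.1%}")
    # Example pilot positions under the assumed comb-type layout.
    print("low-density indices      :", comb_pilot_indices(NUM_POSITIONS, 6))
    print("ultra-low-density indices:", comb_pilot_indices(NUM_POSITIONS, 2))
```

Under these assumptions, going from $\rho_H$ to $\rho_L$ removes $10/13$ (about 77%) of the pilot positions, and going from $\rho_L$ to $\rho_L^\prime$ removes a further $2/3$ of those that remain.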