It is well known that finding the optimal beamformers in massive multiple-input multiple-output (MIMO) networks is challenging because the problem is non-convex, and conventional optimization-based algorithms suffer from high computational cost. Although computationally efficient deep-learning-based methods have been proposed, their complexity depends heavily on system parameters such as the number of transmit antennas, so they typically do not generalize well to heterogeneous scenarios in which the base stations (BSs) are equipped with different numbers of transmit antennas and have different inter-BS distances. This paper proposes a novel deep-learning-based beamforming algorithm to address these challenges. Specifically, we consider the weighted sum rate (WSR) maximization problem in multiple-input single-output (MISO) interference channels and propose a deep neural network architecture obtained by unfolding a parallel gradient projection algorithm. Somewhat surprisingly, by leveraging the low-dimensional structure of the optimal beamforming solution, the constructed neural network can be made independent of the numbers of transmit antennas and BSs. Moreover, the design can be further extended to a cooperative multicell network. Numerical results based on both synthetic and ray-tracing channel models show that the proposed neural network achieves high WSRs with significantly reduced runtime, while exhibiting favorable generalization capability with respect to the number of antennas, the number of BSs, and the inter-BS distance.
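
For concreteness, the WSR maximization problem and one gradient projection iteration can be sketched as follows. The notation is assumed for illustration and is not taken from the abstract: there are $K$ BS-user pairs, $\mathbf{h}_{jk}$ is the channel from BS $j$ to user $k$, $\mathbf{v}_k$ is the beamformer of BS $k$, $\alpha_k$ is the rate weight, $\sigma_k^2$ is the noise power, $P_k$ is the power budget, $\eta$ is a step size, $R$ denotes the WSR objective in the first line, and $\nabla_{\mathbf{v}_k}$ is the gradient with respect to $\mathbf{v}_k$ (the conjugate gradient in the complex case). The paper's own formulation may differ.

```latex
% Assumed notation (not from the abstract): K BS-user pairs,
% h_{jk} = channel from BS j to user k, v_k = beamformer of BS k.
\begin{align}
\max_{\{\mathbf{v}_k\}} \quad & \sum_{k=1}^{K} \alpha_k \log_2\!\left(1 + \frac{|\mathbf{h}_{kk}^{H}\mathbf{v}_k|^2}{\sum_{j\neq k} |\mathbf{h}_{jk}^{H}\mathbf{v}_j|^2 + \sigma_k^2}\right) \\
\text{s.t.} \quad & \|\mathbf{v}_k\|^2 \le P_k, \quad k = 1,\dots,K, \\
% One parallel gradient projection step: every BS updates from the same iterate t.
\mathbf{v}_k^{(t+1)} &= \Pi_k\!\left(\mathbf{v}_k^{(t)} + \eta\, \nabla_{\mathbf{v}_k} R\big(\mathbf{v}_1^{(t)},\dots,\mathbf{v}_K^{(t)}\big)\right), \qquad
\Pi_k(\mathbf{v}) = \min\!\left(1, \frac{\sqrt{P_k}}{\|\mathbf{v}\|}\right)\mathbf{v}.
\end{align}
```

In this sketch all $K$ updates use the iterate from step $t$, which is what makes the scheme parallel, and $\Pi_k$ is the Euclidean projection onto the per-BS power constraint. In a typical unfolded design, a fixed number of such iterations is treated as network layers with quantities such as the step size made learnable; whether the proposed network follows exactly this pattern is not stated in the abstract.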