This paper investigates deep learning techniques that predict transmit beamforming from historical channel data alone, without current channel information, in the multiuser multiple-input single-output (MISO) downlink. Removing the need for current channel information significantly reduces the channel estimation overhead and improves spectrum efficiency, especially in high-mobility vehicular communications. Specifically, we propose a joint learning framework that incorporates channel prediction and power optimization and directly outputs the predicted transmit beamforming. In addition, we propose to use an attention mechanism in long short-term memory (LSTM) recurrent neural networks to improve the accuracy of channel prediction. Simulation results using both a simple autoregressive process model and the more realistic 3GPP spatial channel model verify that the proposed predictive beamforming scheme significantly improves the effective spectrum efficiency compared with traditional channel estimation and with a method that first predicts the channel and then optimizes the beamforming separately.
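To make the simple autoregressive channel model concrete, the following is a minimal sketch of how a time-correlated channel history could be generated. It assumes a first-order (Gauss-Markov) process with unit-variance Rayleigh fading; the model order, the correlation coefficient `rho`, and the function name `simulate_ar1_channel` are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def simulate_ar1_channel(num_steps, num_antennas, rho=0.95, seed=0):
    """Simulate a first-order autoregressive Rayleigh-fading channel.

    Assumed model (not specified in the abstract):
        h[t] = rho * h[t-1] + sqrt(1 - rho**2) * w[t],  w[t] ~ CN(0, 1)
    Returns a list of num_steps snapshots, each a list of num_antennas
    complex coefficients.
    """
    if num_steps < 1 or num_antennas < 1:
        raise ValueError("num_steps and num_antennas must be positive")
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")

    rng = random.Random(seed)
    sigma = math.sqrt(0.5)  # per-dimension std so that E|w|^2 = 1

    def complex_gaussian():
        # One circularly symmetric complex Gaussian sample, CN(0, 1).
        return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

    # Start from the stationary distribution so every snapshot has unit power.
    h = [complex_gaussian() for _ in range(num_antennas)]
    history = [h]
    innovation_scale = math.sqrt(1.0 - rho ** 2)
    for _ in range(num_steps - 1):
        h = [rho * x + innovation_scale * complex_gaussian() for x in h]
        history.append(h)
    return history
```

For instance, `simulate_ar1_channel(100, 4)` yields 100 consecutive snapshots of a 4-antenna channel; the earlier snapshots could serve as the historical input to a predictor and the last one as the prediction target. A larger `rho` corresponds to slower channel variation, so lower values loosely mimic higher mobility.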