Extremely large-scale massive multiple-input multiple-output (XL-MIMO) is regarded as a promising technology for next-generation communication systems. To enhance beamforming gain, codebook-based beam training is widely adopted in XL-MIMO systems. However, the near-field region expands in XL-MIMO systems, so a near-field codebook must be adopted for beam training, which significantly increases the pilot overhead. To tackle this problem, we propose a deep learning-based beam training scheme that accounts for both the near-field channel model and the near-field codebook. Specifically, we use the received signals corresponding to far-field wide beams to estimate the optimal near-field beam. Two schemes are proposed: an original scheme and an improved scheme. The original scheme estimates the optimal near-field codeword directly from the output of the neural networks. By contrast, the improved scheme performs additional beam testing, which can significantly improve beam training performance. Simulation results show that the proposed schemes significantly reduce the training overhead in the near-field region while still achieving beamforming gains.
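The difference between the two schemes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the network outputs one score per near-field codeword, and it assumes that "additional beam testing" means measuring the received power of the k highest-scoring codewords and keeping the strongest. The names `original_scheme`, `improved_scheme`, `measure_power`, and `k` are hypothetical, as are the toy numbers.

```python
import numpy as np


def original_scheme(scores):
    """Pick the near-field codeword with the highest network score."""
    return int(np.argmax(scores))


def improved_scheme(scores, measure_power, k=3):
    """Test the k highest-scoring codewords and keep the strongest one.

    measure_power(index) is a hypothetical callable that returns the
    received power observed when the codeword with that index is used.
    Choosing candidates by top-k score is an assumption of this sketch.
    """
    # Indices of the k largest scores, highest score first.
    candidates = np.argsort(scores)[::-1][:k]
    # One extra pilot transmission per candidate codeword.
    powers = [measure_power(int(i)) for i in candidates]
    return int(candidates[int(np.argmax(powers))])


# Toy data: the network slightly prefers codeword 1,
# but codeword 2 actually yields the highest received power.
scores = np.array([0.05, 0.40, 0.35, 0.15, 0.05])
true_power = np.array([0.1, 0.6, 0.9, 0.3, 0.1])

best_direct = original_scheme(scores)
best_tested = improved_scheme(scores, lambda i: true_power[i], k=2)
print(best_direct, best_tested)
```

In this toy case the direct estimate selects codeword 1, while testing the two top candidates recovers codeword 2 at the cost of two extra measurements, which is far fewer than sweeping the whole near-field codebook.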