Velocity picking, a critical step in seismic data processing, has been studied for decades. Although manual picking can produce accurate normal moveout (NMO) velocities from the velocity spectra of prestack gathers, it is time-consuming and becomes infeasible as the volume of seismic data grows. Numerous automatic velocity picking methods have thus been developed. In recent years, deep learning (DL) methods have produced good results on seismic data with medium and high signal-to-noise ratios (SNR). However, a picking method that automatically generates accurate velocities at low SNR is still lacking. In this paper, we propose a multi-information fusion network (MIFN) that estimates stacking velocity by fusing information from velocity spectra and stack gather segments (SGS). In particular, we recast velocity picking as a semantic segmentation problem on velocity spectrum images, and the information provided by SGS is used as a prior in the network to assist segmentation. Experimental results on two field datasets show that the picking results of MIFN are stable and accurate in medium- and high-SNR scenarios, and that MIFN also performs well in low-SNR scenarios.
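To make the segmentation framing concrete, the sketch below shows one plausible way a per-pixel probability map over a velocity spectrum image (rows are time samples, columns are trial velocities) could be reduced to a single velocity per time sample. It is an illustration under stated assumptions, not the post-processing used by MIFN: the function name `mask_to_velocity_curve`, the 0.5 threshold, the probability-weighted mean, and the linear interpolation across empty rows are all hypothetical choices.

```python
import numpy as np

def mask_to_velocity_curve(prob, velocities, threshold=0.5):
    """Reduce a segmentation probability map to one velocity per time sample.

    prob       : (n_time, n_vel) array of per-pixel probabilities (assumed network output)
    velocities : (n_vel,) array of trial velocities along the spectrum's velocity axis
    threshold  : pixels below this probability are ignored (hypothetical default)
    """
    prob = np.asarray(prob, dtype=float)
    velocities = np.asarray(velocities, dtype=float)

    # Keep only confident pixels and use their probabilities as weights.
    weights = np.where(prob >= threshold, prob, 0.0)
    row_sum = weights.sum(axis=1)
    picked = row_sum > 0

    # Probability-weighted mean velocity for every time sample with a pick.
    curve = np.full(prob.shape[0], np.nan)
    curve[picked] = (weights[picked] @ velocities) / row_sum[picked]

    # Fill time samples without any confident pixel by linear interpolation.
    if picked.any():
        idx = np.arange(prob.shape[0])
        curve[~picked] = np.interp(idx[~picked], idx[picked], curve[picked])
    return curve

# Toy example: three time samples, four trial velocities (m/s).
velocities = np.array([1500.0, 2000.0, 2500.0, 3000.0])
prob = np.zeros((3, 4))
prob[0, 0] = 0.9   # confident pick at 1500 m/s for the first time sample
prob[2, 2] = 0.8   # confident pick at 2500 m/s for the last time sample
curve = mask_to_velocity_curve(prob, velocities)
```

In this toy case the middle time sample has no confident pixel, so its velocity is interpolated between the two neighbouring picks. In a real low-SNR spectrum the choice of threshold and gap-filling rule would matter considerably and would need to be tuned or replaced.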