Terahertz (THz) communication is considered one of the key solutions for supporting the extremely high data demand of 6G. A main difficulty of THz communication is the severe signal attenuation caused by foliage loss, oxygen/atmospheric absorption, and body and hand losses. To compensate for this severe path loss, beamforming based on multiple-input multiple-output (MIMO) antenna arrays is widely used. Since the beams must be aligned with the signal propagation paths to achieve the maximum beamforming gain, acquiring accurate channel knowledge, i.e., channel estimation, is of great importance. In this paper, we propose a new deep learning (DL)-based parametric channel estimation technique. In our work, the DL model learns the mapping between the received pilot signal and the sparse channel parameters that characterize the spherical-domain channel. By exploiting a long short-term memory (LSTM) network, we efficiently extract the temporally correlated features of the sparse channel parameters and thus obtain an accurate estimate with relatively small pilot overhead. Numerical experiments show that the proposed scheme is effective in estimating the near-field THz MIMO channel in downlink environments.