Transit timing variation (TTV) provides rich information about the mass and orbital properties of exoplanets, which are often obtained by solving an inverse problem via Markov Chain Monte Carlo (MCMC). In this paper, we design a new data-driven approach that can potentially be applied to problems that are hard for traditional MCMC methods, such as systems in which only one planet transits. Specifically, we use deep learning to predict the parameters of the non-transiting companion in a single-transiting system, taking transit information (i.e., TTV and transit duration variation (TDV)) as input. Thanks to a newly constructed \textit{Transformer}-based architecture that can extract long-range interactions from sequential TTV data, this previously difficult task can now be accomplished with high accuracy, with an overall fractional error of $\sim$2\% on mass and eccentricity.