In this paper, based on the spirit of Fitted Q-Iteration (FQI), we propose a Gauss-Newton Temporal Difference (GNTD) method to solve the Q-value estimation problem with function approximation. In each iteration, unlike the original FQI that solves a nonlinear least square subproblem to fit the Q-iteration, the GNTD method can be viewed as an \emph{inexact} FQI that takes only one Gauss-Newton step to optimize this subproblem, which is much cheaper in computation. Compared to the popular Temporal Difference (TD) learning, which can be viewed as taking a single gradient descent step to FQI's subproblem per iteration, the Gauss-Newton step of GNTD better retains the structure of FQI and hence leads to better convergence. In our work, we derive the finite-sample non-asymptotic convergence of GNTD under linear, neural network, and general smooth function approximations. In particular, recent works on neural TD only guarantee a suboptimal $\mathcal{\mathcal{O}}(\epsilon^{-4})$ sample complexity, while GNTD obtains an improved complexity of $\tilde{\mathcal{O}}(\epsilon^{-2})$. Finally, we validate our method via extensive experiments in both online and offline RL problems. Our method exhibits both higher rewards and faster convergence than TD-type methods, including DQN.