Accurate knowledge of the distribution system topology and parameters is required for effective voltage control, but it is difficult to obtain in practice. This paper develops a model-free approach based on a surrogate model and deep reinforcement learning (DRL), and extends it to unbalanced three-phase scenarios. The key idea is to learn, from historical data, a surrogate model that captures the relationship between the power injections and the voltage fluctuations at each node, instead of relying on the original model, which is inaccurate because of errors and uncertainties. This allows DRL to be integrated with the learned surrogate model. In particular, DRL is applied to learn the optimal control strategy from the experiences obtained by continuous interactions with the surrogate model. The integrated framework involves training three networks, namely the surrogate model, actor, and critic networks, which fully leverage the strong nonlinear fitting ability of deep learning and the ability of DRL to make decisions online. Several single-phase approaches have also been extended to handle unbalanced three-phase scenarios. Simulation results on the IEEE 123-bus system show that the proposed method achieves performance similar to that of methods that use accurate physical models.
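
As a rough illustration of the workflow described above, the following minimal sketch fits a surrogate from historical data and then improves a control action by interacting only with that surrogate. It is not the paper's implementation. Two simplifications are assumed: a linear least-squares fit stands in for the surrogate network, and a random search stands in for the actor and critic networks. The three-node system, the sensitivity matrix `true_sens`, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy system (assumed, not from the paper): 3 nodes with a
# linearized voltage sensitivity to net power injections.
n_nodes = 3
true_sens = np.array([[0.05, 0.02, 0.01],
                      [0.02, 0.06, 0.02],
                      [0.01, 0.02, 0.07]])
v_nominal = 1.0

# Synthetic "historical" data: injections and the measured voltages they cause.
injections = rng.uniform(-1.0, 1.0, size=(500, n_nodes))
voltages = (v_nominal + injections @ true_sens.T
            + rng.normal(0.0, 1e-4, size=(500, n_nodes)))

# Step 1: learn the surrogate from data only. Least squares with a bias
# column is a stand-in for the surrogate network.
features = np.hstack([injections, np.ones((len(injections), 1))])
coef, *_ = np.linalg.lstsq(features, voltages, rcond=None)


def surrogate(p):
    """Predict node voltages from net power injections."""
    return np.append(p, 1.0) @ coef


def reward(load, control):
    """Negative squared voltage deviation predicted by the surrogate."""
    v = surrogate(load + control)
    return -np.sum((v - v_nominal) ** 2)


def improve_action(load, n_iters=200, step=0.1):
    """Random search over control actions, querying only the surrogate.

    A stand-in for actor-critic training: a candidate is kept only if the
    surrogate predicts a higher reward.
    """
    best = np.zeros(n_nodes)
    best_r = reward(load, best)
    for _ in range(n_iters):
        cand = np.clip(best + step * rng.normal(size=n_nodes), -1.0, 1.0)
        r = reward(load, cand)
        if r > best_r:
            best, best_r = cand, r
    return best, best_r
```

Calling `improve_action(np.array([0.5, -0.3, 0.8]))` returns a control vector and its predicted reward. The physical model is never queried during the search, which is the point of the surrogate: the controller needs only measurement data, not the network topology or line parameters.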