Model-based Vol/VAR optimization method is widely used to eliminate voltage violations and reduce network losses. However, the parameters of active distribution networks(ADNs) are not onsite identified, so significant errors may be involved in the model and make the model-based method infeasible. To cope with this critical issue, we propose a novel two-stage deep reinforcement learning (DRL) method to improve the voltage profile by regulating inverter-based energy resources, which consists of offline stage and online stage. In the offline stage, a highly efficient adversarial reinforcement learning algorithm is developed to train an offline agent robust to the model mismatch. In the sequential online stage, we transfer the offline agent safely as the online agent to perform continuous learning and controlling online with significantly improved safety and efficiency. Numerical simulations on IEEE test cases not only demonstrate that the proposed adversarial reinforcement learning algorithm outperforms the state-of-art algorithm, but also show that our proposed two-stage method achieves much better performance than the existing DRL based methods in the online application.