Hyper-parameter optimization is a crucial problem in machine learning as it aims to achieve the state-of-the-art performance in any model. Great efforts have been made in this field, such as random search, grid search, Bayesian optimization. In this paper, we model hyper-parameter optimization process as a Markov decision process, and tackle it with reinforcement learning. A novel hyper-parameter optimization method based on soft actor critic and hierarchical mixture regularization has been proposed. Experiments show that the proposed method can obtain better hyper-parameters in a shorter time.