Optimizing fleet management is an important problem in ride-hailing services, since it affects both customer satisfaction and driver revenue. However, finding an optimal strategy in a real environment, where demand and supply change in real time, is challenging. In this paper, we address this problem with multi-agent reinforcement learning. Instead of the grid used in existing work, we use a graph to represent the road network more realistically. We model the problem as a Markov game and adopt a stochastic policy update, which resolves the problem that greedy policy updates cause in the multi-agent setting. We modify DQN to fit our problem. To approximate Q-values on the graph, we use two basic graph neural networks: the graph convolutional network (GCN) and the graph attention network (GAT). We design a simulator based on real taxi call data and evaluate the algorithms under various conditions. The results demonstrate the effectiveness of the proposed stochastic policy update model with graph neural networks.
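
To make the contrast between greedy and stochastic action selection concrete, the following Python sketch shows what can happen when many agents at the same node share the same Q-values. It is only an illustration under stated assumptions: the Boltzmann (softmax) rule, the `temperature` parameter, the node names, and the Q-values are hypothetical and are not taken from the method described here.

```python
import math
import random
from collections import Counter


def greedy_action(q_values):
    """Return the action with the highest Q-value."""
    return max(q_values, key=q_values.get)


def boltzmann_policy(q_values, temperature=1.0):
    """Turn Q-values into a softmax distribution over actions.

    The softmax form and the temperature are assumptions made for this
    sketch, not the update rule of the paper.
    """
    q_max = max(q_values.values())  # subtract the max for numerical stability
    weights = {a: math.exp((q - q_max) / temperature) for a, q in q_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def sample_action(q_values, rng, temperature=1.0):
    """Sample one action from the softmax distribution."""
    probs = boltzmann_policy(q_values, temperature)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


# Hypothetical Q-values for an agent deciding whether to stay at its
# current node or move to one of two adjacent nodes of the road graph.
q = {"stay": 1.0, "node_b": 1.2, "node_c": 0.9}

rng = random.Random(0)
n_agents = 100

# Every agent that sees the same Q-values picks the same node when greedy.
greedy_counts = Counter(greedy_action(q) for _ in range(n_agents))

# With a stochastic policy the same agents spread over several nodes.
stochastic_counts = Counter(sample_action(q, rng) for _ in range(n_agents))

print("greedy:    ", dict(greedy_counts))
print("stochastic:", dict(stochastic_counts))
```

In this toy setting the greedy rule sends all 100 agents to the single highest-valued node, whereas sampling distributes them roughly in proportion to the softmax probabilities, which is the kind of herding effect a stochastic update is meant to avoid.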