Crowd movement guidance has been a fascinating problem in various fields, such as easing traffic congestion in unusual events and evacuating people from an emergency-affected area. To grab the reins of crowds, there has been considerable demand for a decision support system that can answer a typical question: ``what will be the outcomes of each of the possible options in the current situation. In this paper, we consider the problem of estimating the effects of crowd movement guidance from past data. To cope with limited amount of available data biased by past decision-makers, we leverage two recent techniques in deep representation learning for spatial data analysis and causal inference. We use a spatial convolutional operator to extract effective spatial features of crowds from a small amount of data and use balanced representation learning based on the integral probability metrics to mitigate the selection bias and missing counterfactual outcomes. To evaluate the performance on estimating the treatment effects of possible guidance, we use a multi-agent simulator to generate realistic data on evacuation scenarios in a crowded theater, since there are no available datasets recording outcomes of all possible crowd movement guidance. The results of three experiments demonstrate that our proposed method reduces the estimation error by at most 56% from state-of-the-art methods.