Abnormal behavior detection in surveillance video is a pivotal component of the intelligent city. Most existing methods consider only how to detect anomalies, and few attempt to explain why a behavior is anomalous. We investigate an orthogonal perspective based on the reasons behind these abnormal behaviors. To this end, we propose a multivariate fusion method that analyzes each target through three branches: object, action, and motion. The object branch focuses on appearance information, the motion branch on the distribution of motion features, and the action branch on the action category of the target. Because the branches attend to different information, they complement each other and jointly detect abnormal behavior. The final anomaly score is obtained by combining the anomaly scores of the three branches.
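
The combination of the three branch scores can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted sum below is an assumption, as are the function name `fuse_scores`, the equal default weights, and the premise that each branch score is already normalized to [0, 1]. Returning the branch with the largest weighted contribution is one simple way such a fusion could point to the reason for an anomaly.

```python
def fuse_scores(obj, action, motion, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Fuse per-branch anomaly scores with a weighted sum (assumed rule).

    obj, action, motion: anomaly scores of the three branches,
        assumed to be normalized to [0, 1].
    weights: hypothetical per-branch weights in the order
        (object, action, motion).
    Returns (fused_score, dominant_branch).
    """
    branch_scores = {"object": obj, "action": action, "motion": motion}
    branch_weights = dict(zip(branch_scores, weights))

    # Weighted contribution of each branch to the final score.
    contributions = {
        name: branch_weights[name] * score
        for name, score in branch_scores.items()
    }
    fused = sum(contributions.values())

    # The branch contributing most gives a coarse hint at the cause.
    dominant = max(contributions, key=contributions.get)
    return fused, dominant


if __name__ == "__main__":
    score, reason = fuse_scores(0.9, 0.1, 0.2, weights=(0.5, 0.25, 0.25))
    print(f"fused score = {score:.3f}, dominant branch = {reason}")
```

In this sketch a target with an unusual appearance but ordinary action and motion is flagged mainly by the object branch, which is the kind of per-branch attribution the three-branch design makes possible.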