Saliency maps have been widely used to interpret deep learning classifiers for Alzheimer's disease (AD). However, since AD is heterogeneous and has multiple subtypes, the pathological mechanism of AD remains not fully understood and may vary from patient to patient. Due to the lack of such understanding, it is difficult to comprehensively and effectively assess the saliency map of AD classifier. In this paper, we utilize the anatomical segmentation to allocate saliency values into different brain regions. By plotting the distributions of saliency maps corresponding to AD and NC (Normal Control), we can gain a comprehensive view of the model's decisions process. In order to leverage the fact that the brain volume shrinkage happens in AD patients during disease progression, we define a new evaluation metric, brain volume change score (VCS), by computing the average Pearson correlation of the brain volume changes and the saliency values of a model in different brain regions for each patient. Thus, the VCS metric can help us gain some knowledge of how saliency maps resulting from different models relate to the changes of the volumes across different regions in the whole brain. We trained candidate models on the ADNI dataset and tested on three different datasets. Our results indicate: (i) models with higher VCSs tend to demonstrate saliency maps with more details relevant to the AD pathology, (ii) using gradient-based adversarial training strategies such as FGSM and stochastic masking can improve the VCSs of the models.