The usage of environment sensor models for virtual testing is a promising approach to reduce the testing effort of autonomous driving. However, in order to deduce any statements regarding the performance of an autonomous driving function based on simulation, the sensor model has to be validated to determine the discrepancy between the synthetic and real sensor data. Since a certain degree of divergence can be assumed to exist, the sufficient level of fidelity must be determined, which poses a major challenge. In particular, a method for quantifying the fidelity of a sensor model does not exist and the problem of defining an appropriate metric remains. In this work, we train a neural network to distinguish real and simulated radar sensor data with the purpose of learning the latent features of real radar point clouds. Furthermore, we propose the classifier's confidence score for the `real radar point cloud' class as a metric to determine the degree of fidelity of synthetically generated radar data. The presented approach is evaluated and it can be demonstrated that the proposed deep evaluation metric outperforms conventional metrics in terms of its capability to identify characteristic differences between real and simulated radar data.