As the safety validation requirements for releasing a self-driving car increase, alternative approaches such as simulation-based testing are emerging alongside conventional real-world testing. To rely on virtual tests, the employed sensor models have to be validated. It is therefore necessary to quantify the discrepancy between simulation and reality in order to determine whether a given fidelity is sufficient for the intended use. To date, there is no sound method to measure this simulation-to-reality gap of radar perception for autonomous driving. We address this problem by introducing a multi-layered evaluation approach that combines an explicit and an implicit sensor model evaluation. The former directly evaluates the realism of the synthetically generated sensor data, while the latter evaluates the model indirectly through a downstream target application. To demonstrate the method, we evaluate the fidelity of three typical radar model types (ideal, data-driven, ray tracing-based) and their applicability for virtually testing radar-based multi-object tracking. The results show that the proposed approach provides an in-depth sensor model assessment that makes existing disparities visible and enables a realistic estimation of the overall model fidelity across different scenarios.