A major challenge for autonomous vehicles is interacting with other traffic participants safely and smoothly. A promising approach to handle such traffic interactions is equipping autonomous vehicles with interaction-aware controllers (IACs). These controllers predict how surrounding human drivers will respond to the autonomous vehicle's actions, based on a driver model. However, the predictive validity of driver models used in IACs is rarely validated, which can limit the interactive capabilities of IACs outside the simple simulated environments in which they are demonstrated. In this paper, we argue that besides evaluating the interactive capabilities of IACs, their underlying driver models should be validated on natural human driving behavior. We propose a workflow for this validation that includes scenario-based data extraction and a two-stage (tactical/operational) evaluation procedure based on human factors literature. We demonstrate this workflow in a case study on an inverse-reinforcement-learning-based driver model replicated from an existing IAC. This model only showed the correct tactical behavior in 40% of the predictions. The model's operational behavior was inconsistent with observed human behavior. The case study illustrates that a principled evaluation workflow is useful and needed. We believe that our workflow will support the development of appropriate driver models for future automated vehicles.