A rigorous verification and validation (V&V) process is needed to evaluate the safety of highly automated vehicles (HAVs) before they are widely deployed on public roads. In this paper, we propose an interaction-aware framework for HAV safety evaluation that is suited to highly interactive driving scenarios such as highway merging and roundabout entry. In contrast to existing approaches, in which the primary other vehicle (POV) follows predetermined maneuvers, we model the POV as a game-theoretic agent. To capture a wide variety of interactions between the POV and the vehicle under test (VUT), we characterize interactive behavior using level-k game theory and social value orientation, and we train a diverse set of POVs using reinforcement learning. Moreover, we propose an adaptive test case sampling scheme based on Gaussian process regression to generate customized and diverse challenging cases. Highway merging is used as the example scenario. We find that the proposed method captures a wide range of POV behaviors and achieves better coverage of the failure modes of the VUT than other evaluation approaches.
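As an illustration of how social value orientation can shape a POV's objective, the sketch below uses the common angular formulation, in which an agent's utility blends its own reward and the other agent's reward through an orientation angle. The abstract does not state the exact reward used in this work, so the function name `svo_reward`, the angle values, and the sample rewards are all assumptions made for illustration only.

```python
import math


def svo_reward(r_self: float, r_other: float, svo_angle_deg: float) -> float:
    """Blend own and other's reward with an SVO angle.

    Assumed (common) formulation, not necessarily the one used in the paper:
        utility = cos(phi) * r_self + sin(phi) * r_other
    phi = 0 deg is purely egoistic, 45 deg weighs both equally,
    and 90 deg is purely altruistic.
    """
    phi = math.radians(svo_angle_deg)
    return math.cos(phi) * r_self + math.sin(phi) * r_other


# Hypothetical per-step rewards for a merging POV and for the VUT.
r_pov, r_vut = 1.0, -0.5

# Sweeping the angle gives differently motivated POVs from the same raw rewards.
for label, angle in [("egoistic", 0.0), ("prosocial", 45.0), ("altruistic", 90.0)]:
    print(f"{label:>10}: {svo_reward(r_pov, r_vut, angle):+.3f}")
```

Under this assumption, varying the angle across training runs is one simple way to obtain POVs that range from aggressive to yielding while keeping the underlying reward terms fixed.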