Testing and evaluation is a critical step in the development and deployment of connected and automated vehicles (CAVs), yet there is no systematic framework for generating a testing scenario library. This paper aims to provide a general framework for solving the testing scenario library generation (TSLG) problem for different scenario types, CAV models, and performance metrics. In Part I of the paper, four research questions are identified: (1) scenario description, (2) metric design, (3) library generation, and (4) CAV evaluation. To answer these questions, a unified framework is proposed. First, the operational design domain of CAVs is considered for scenario description and decision variable formulation. Second, a set of incremental performance metrics is designed, including safety, functionality, mobility, and rider's comfort. Third, a new definition of criticality is proposed as a combination of maneuver challenge and exposure frequency, and a critical scenario searching method is designed based on multi-start optimization and the seed-fill method. Finally, with the generated library, CAVs can be evaluated through three steps: scenario sampling, field testing, and index value estimation. The proposed framework is theoretically proved to obtain accurate evaluation results with a much smaller number of tests than the public road test method. In Part II of the paper, three case studies are investigated to demonstrate the proposed methodologies. A reinforcement learning based technique is applied to enhance the method for high-dimensional scenarios.
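As a minimal notational sketch of the criticality idea named above (not the paper's exact formulation; the symbols x for a scenario, theta for a CAV model, P for exposure frequency, M for maneuver challenge, and gamma for a threshold are illustrative assumptions), criticality can be read as a product of the two factors, with the library collecting scenarios whose criticality is sufficiently high:

% Illustrative sketch only; notation is assumed, not taken from the abstract.
\[
  V(x \mid \theta) \;=\; P(x)\, M(x \mid \theta),
  \qquad
  \text{Library} \;=\; \{\, x \;:\; V(x \mid \theta) \ge \gamma \,\}.
\]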