Abstract: Generating representative rear-end crash scenarios is crucial for safety assessments of Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS). However, existing methods for scenario generation face challenges such as limited and biased in-depth crash data and difficulties in validation. This study sought to overcome these challenges by combining naturalistic driving data with pre-crash kinematics data from rear-end crashes. The combined dataset was weighted to create a representative dataset of rear-end crash characteristics across the full severity range in the United States. Multivariate distribution models were built for the combined dataset, and a driver behavior model for the following vehicle was created by combining two existing models. Simulations were conducted to generate a set of synthetic rear-end crash scenarios, which were then weighted to create a representative synthetic rear-end crash dataset. Finally, the synthetic dataset was validated by comparing the parameter distributions and the outcomes (Delta-v, the total change in vehicle velocity over the duration of the crash event) of the generated crashes with those in the original combined dataset. The synthetic crash dataset can be used for safety assessments of ADAS and ADS and as a benchmark when evaluating the representativeness of scenarios generated through other methods.