Datasets are essential to apply AI algorithms to Cyber Physical System (CPS) Security. Due to scarcity of real CPS datasets, researchers elected to generate their own datasets using either real or virtualized testbeds. However, unlike other AI domains, a CPS is a complex system with many interfaces that determine its behavior. A dataset that comprises merely a collection of sensor measurements and network traffic may not be sufficient to develop resilient AI defensive or offensive agents. In this paper, we study the \emph{elements} of CPS security datasets required to capture the system behavior and interactions, and propose a dataset architecture that has the potential to enhance the performance of AI algorithms in securing cyber physical systems. The framework includes dataset elements, attack representation, and required dataset features. We compare existing datasets to the proposed architecture to identify the current limitations and discuss the future of CPS dataset generation using testbeds.