Training data is an essential resource for creating capable and robust vision systems which are integral to the proper function of many robotic systems. Synthesized training data has been shown in recent years to be a viable alternative to manually collecting and labelling data. In order to meet the rising popularity of synthetic image training data we propose a framework for defining synthetic image data pipelines. Additionally we survey the literature to identify the most promising candidates for components of the proposed pipeline. We propose that defining such a pipeline will be beneficial in reducing development cycles and coordinating future research.