The structural characterization of hetero-aggregates in 3D is of great interest, e.g., for deriving process-structure or structure-property relationships. However, since 3D imaging techniques are often difficult to perform as well as time and cost intensive, a characterization of hetero-aggregates based on 2D image data is desirable, but often non-trivial. To overcome the issues of characterizing 3D structures from 2D measurements, a method is presented that relies on machine learning combined with methods of spatial stochastic modeling, where the latter are utilized for the generation of synthetic training data. This kind of training data has the advantage that time-consuming experiments for the synthesis of differently structured materials followed by their 3D imaging can be avoided. More precisely, a parametric stochastic 3D model is presented, from which a wide spectrum of virtual hetero-aggregates can be generated. Additionally, the virtual structures are passed to a physics-based simulation tool in order to generate virtual scanning transmission electron microscopy (STEM) images. The preset parameters of the 3D model together with the simulated STEM images serve as a database for the training of convolutional neural networks, which can be used to determine the parameters of the underlying 3D model and, consequently, to predict 3D structures of hetero-aggregates from 2D STEM images. Furthermore, an error analysis is performed to evaluate the prediction power of the trained neural networks with respect to structural descriptors, e.g. the hetero-coordination number.