Abstract: Handling large amounts of data has become key to developing automated driving systems. Especially for highly automated driving functions, working with images has become increasingly challenging due to the sheer size of the required data. Such data has to satisfy different requirements to be usable in machine learning-based approaches. Thus, engineers need to fully understand their large image data sets in order to develop and test machine learning algorithms. However, current approaches lack automatability, are not generic, and are limited in their expressiveness. Hence, this paper analyzes a state-of-the-art text and image embedding neural network and guides the reader through its application in the automotive domain. This approach enables both the search for similar images and the search based on a human-understandable, text-based description. Our experiments show the automatability and generalizability of our proposed method for handling large data sets in the automotive domain.