Abstract: The examination of the musculoskeletal system in dogs is a challenging task in veterinary practice. In this work, a novel method has been developed that enables efficient documentation of a dog's condition through a visual representation. Because this form of visual documentation is new, no training data exists for it yet. The objective of this work is therefore to mitigate the impact of data scarcity in order to develop an AI-based diagnostic support system. To this end, we investigate the potential of synthetic data for pre-training AI models and propose a method for generating synthetic image data that mimics realistic visual documentations of diseases. Initially, a basic dataset containing three distinct classes is generated, followed by a more sophisticated dataset containing 36 different classes. Both datasets are used to pre-train an AI model. Subsequently, an evaluation dataset is created, consisting of 250 manually created visual documentations for five different diseases. This dataset, along with a subset containing 25 examples, is used for evaluation. The results on the 25-example evaluation dataset demonstrate a significant improvement of approximately 10% in diagnosis accuracy when the model is pre-trained on generated synthetic images that mimic real-world visual documentations. However, these results do not hold for the larger evaluation dataset containing 250 examples, indicating that the advantages of using synthetic data for pre-training emerge primarily when only a few examples of visual documentations are available for a given disease. Overall, this work provides valuable insights into mitigating the limitations imposed by scarce training data through the strategic use of generated synthetic data, presenting an approach that is applicable beyond canine musculoskeletal assessment.