Artificial intelligence (AI) in radiology has made great strides in recent years, but many hurdles remain. Overfitting and lack of generalizability represent important ongoing challenges hindering accurate and dependable clinical deployment. If AI algorithms can avoid overfitting and achieve true generalizability, they can go from the research realm to the forefront of clinical work. Recently, small data AI approaches such as deep neuroevolution (DNE) have avoided overfitting small training sets. We seek to address both overfitting and generalizability by applying DNE to a virtually pooled data set consisting of images from various institutions. Our use case is classifying neuroblastoma brain metastases on MRI. Neuroblastoma is well-suited for our goals because it is a rare cancer. Hence, studying this pediatric disease requires a small data approach. As a tertiary care center, the neuroblastoma images in our local Picture Archiving and Communication System (PACS) are largely from outside institutions. These multi-institutional images provide a heterogeneous data set that can simulate real world clinical deployment. As in prior DNE work, we used a small training set, consisting of 30 normal and 30 metastasis-containing post-contrast MRI brain scans, with 37% outside images. The testing set was enriched with 83% outside images. DNE converged to a testing set accuracy of 97%. Hence, the algorithm was able to predict image class with near-perfect accuracy on a testing set that simulates real-world data. Hence, the work described here represents a considerable contribution toward clinically feasible AI.