Abstract: Existing screening tools for early detection of autism are expensive, cumbersome, and time-intensive, and sometimes fall short in predictive value. In this work, we apply Machine Learning (ML) to gold-standard clinical data obtained across thousands of children at risk for autism spectrum disorders to create a low-cost, quick, and easy-to-apply autism screening tool that performs as well as or better than most widely used standardized instruments. This new tool combines two screening methods into a single assessment, one based on short, structured parent-report questionnaires and the other on tagging key behaviors from short, semi-structured home videos of children. To overcome the scarcity, sparsity, and imbalance of training data, we apply creative feature selection, feature engineering, and novel feature encoding techniques. We allow for an inconclusive determination where appropriate in order to boost screening accuracy on the cases that do receive a conclusive result. We demonstrate a significant accuracy improvement over standard screening tools in a clinical study sample of 162 children.