Abstract:This research project explores the optimization of the family selection process for participation in Uruguay's Crece Contigo Family Support Program (PAF) through machine learning. An anonymized database of 15,436 previous referral cases was analyzed, focusing on pregnant women and children under four years of age. The main objective was to develop a predictive algorithm capable of determining whether a family meets the conditions for acceptance into the program. The implementation of this model seeks to streamline the evaluation process and allow for more efficient resource allocation, allocating more team time to direct support. The study included an exhaustive data analysis and the implementation of various machine learning models, including Neural Networks (NN), XGBoost (XGB), LSTM, and ensemble models. Techniques to address class imbalance, such as SMOTE and RUS, were applied, as well as decision threshold optimization to improve prediction accuracy and balance. The results demonstrate the potential of these techniques for efficient classification of families requiring assistance.
Abstract:Clinical trials predicate subject eligibility on a diversity of criteria ranging from patient demographics to food allergies. Trials post their requirements as semantically complex, unstructured free-text. Formalizing trial criteria to a computer-interpretable syntax would facilitate eligibility determination. In this paper, we investigate an information extraction (IE) approach for grounding criteria from trials in ClinicalTrials(dot)gov to a shared knowledge base. We frame the problem as a novel knowledge base population task, and implement a solution combining machine learning and context free grammar. To our knowledge, this work is the first criteria extraction system to apply attention-based conditional random field architecture for named entity recognition (NER), and word2vec embedding clustering for named entity linking (NEL). We release the resources and core components of our system on GitHub at https://github.com/facebookresearch/Clinical-Trial-Parser. Finally, we report our per module and end to end performances; we conclude that our system is competitive with Criteria2Query, which we view as the current state-of-the-art in criteria extraction.