Abstract: Childhood obesity is a major public health challenge. Obesity in early childhood and adolescence can lead to obesity and other health problems in adulthood. Early identification of children at high risk of developing obesity may enable earlier and more effective interventions to prevent and manage this and related health conditions. Existing predictive tools for childhood obesity rely primarily on traditional regression-type methods and do not exploit the longitudinal patterns in children's data (that is, they ignore data temporality). In this paper, we present a machine learning model designed to predict future obesity patterns from generally available items in children's medical history. To do this, we use a large unaugmented EHR (electronic health record) dataset from a major pediatric health system in the US. We adopt a general LSTM (long short-term memory) network architecture, trained on both dynamic (sequential) and static (demographic) EHR data. We additionally include a set of embedding and attention layers to compute a feature ranking for each timestamp and attention scores for the hidden layer corresponding to each input timestamp. These feature rankings and attention scores add interpretability at both the feature level and the timestamp level.
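To make the timestamp-level attention idea concrete, the following is a minimal sketch rather than the model described above. It assumes dot-product scoring against a hypothetical query vector, and it uses small hand-written vectors in place of the per-timestamp outputs of a trained LSTM. The names `attend`, `combine`, `query`, and all numeric values are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden_states, query):
    """Dot-product attention over per-timestamp hidden vectors.

    Returns one weight per timestamp (summing to 1) and the weighted
    average of the hidden vectors (the context vector).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


def combine(context, static_features):
    """Concatenate the sequence summary with static (demographic) features."""
    return context + static_features


# Hypothetical stand-ins for LSTM outputs at three visits (assumption).
hidden_states = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.7]]
query = [1.0, 1.0]      # hypothetical learned query vector (assumption)
static = [1.0, 0.0]     # hypothetical encoded demographic features (assumption)

weights, context = attend(hidden_states, query)
features = combine(context, static)  # input to a final classifier layer
```

In this sketch, `weights` plays the role of the timestamp-level attention scores: a larger weight marks a visit that contributes more to the summary vector passed to the classifier, which is what makes the scores readable as a per-timestamp importance signal.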