Clinical data for ambulatory care, which accounts for 90% of the nations healthcare spending, is characterized by relatively small sample sizes of longitudinal data, unequal spacing between visits for each patient, with unequal numbers of data points collected across patients. While deep learning has become state-of-the-art for sequence modeling, it is unknown which methods of time aggregation may be best suited for these challenging temporal use cases. Additionally, deep models are often considered uninterpretable by physicians which may prevent the clinical adoption, even of well performing models. We show that time-distributed-dense layers combined with GRUs produce the most generalizable models. Furthermore, we provide a framework for the clinical interpretation of the models.