Abstract:Healthcare data continues to flourish yet a relatively small portion, mostly structured, is being utilized effectively for predicting clinical outcomes. The rich subjective information available in unstructured clinical notes can possibly facilitate higher discrimination but tends to be under-utilized in mortality prediction. This work attempts to assess the gain in performance when multiple notes that have been minimally preprocessed are used as an input for prediction. A hierarchical architecture consisting of both convolutional and recurrent layers is used to concurrently model the different notes compiled in an individual hospital stay. This approach is evaluated on predicting in-hospital mortality on the MIMIC-III dataset. On comparison to approaches utilizing structured data, it achieved higher metrics despite requiring less cleaning and preprocessing. This demonstrates the potential of unstructured data in enhancing mortality prediction and signifies the need to incorporate more raw unstructured data into current clinical prediction methods.