Abstract:In this paper, we proposed two different approaches, a rule-based approach and a machine-learning based approach, to identify active heart failure cases automatically by analyzing electronic health records (EHR). For the rule-based approach, we extracted cardiovascular data elements from clinical notes and matched patients to different colors according their heart failure condition by using rules provided by experts in heart failure. It achieved 69.4% accuracy and 0.729 F1-Score. For the machine learning approach, with bigram of clinical notes as features, we tried four different models while SVM with linear kernel achieved the best performance with 87.5% accuracy and 0.86 F1-Score. Also, from the classification comparison between the four different models, we believe that linear models fit better for this problem. Once we combine the machine-learning and rule-based algorithms, we will enable hospital-wide surveillance of active heart failure through increased accuracy and interpretability of the outputs.