Over the past decade the rate of care unit (CU) use in the United States has been increasing. With an aging population and ever-growing demand for medical care, effective management of patients' transitions among different care facilities will prove indispensible for shortening the length of hospital stays, improving patient outcomes, allocating critical care resources, and reducing preventable re-admissions. In this paper, we focus on an important problem of predicting the so-called "patient flow" from longitudinal electronic health records (EHRs), which has not been explored via existing machine learning techniques. By treating a sequence of transition events as a point process, we develop a novel framework for modeling patient flow through various CUs and jointly predicting patients' destination CUs and duration days. Instead of learning a generative point process model via maximum likelihood estimation, we propose a novel discriminative learning algorithm aiming at improving the prediction of transition events in the case of sparse data. By parameterizing the proposed model as a mutually-correcting process, we formulate the estimation problem via generalized linear models, which lends itself to efficient learning based on alternating direction method of multipliers (ADMM). Furthermore, we achieve simultaneous feature selection and learning by adding a group-lasso regularizer to the ADMM algorithm. Additionally, for suppressing the negative influence of data imbalance on the learning of model, we synthesize auxiliary training data for the classes with extremely few samples, and improve the robustness of our learning method accordingly. Testing on real-world data, we show that our method obtains superior performance in terms of accuracy of predicting the destination CU transition and duration of each CU occupancy.