Anomaly detection in videos has been attracting an increasing amount of attention. Despite the competitive performance of recent methods on benchmark datasets, they typically lack desirable features such as modularity, cross-domain adaptivity, interpretability, and real-time anomalous event detection. Furthermore, current state-of-the-art approaches are evaluated using the standard instance-based detection metric by considering video frames as independent instances, which is not ideal for video anomaly detection. Motivated by these research gaps, we propose a modular and unified approach to the online video anomaly detection and localization problem, called MOVAD, which consists of a novel transfer learning based plug-and-play architecture, a sequential anomaly detector, a mathematical framework for selecting the detection threshold, and a suitable performance metric for real-time anomalous event detection in videos. Extensive performance evaluations on benchmark datasets show that the proposed framework significantly outperforms the current state-of-the-art approaches.