Occupancy maps are widely recognized as an efficient method for facilitating robot motion planning in static environments. However, for intelligent vehicles, occupancy of both the present and future moments is required to ensure safe driving. In the automotive industry, the accurate and continuous prediction of future occupancy maps in traffic scenarios remains a formidable challenge. This paper investigates multi-sensor spatio-temporal fusion strategies for continuous occupancy prediction in a systematic manner. This paper presents FusionMotion, a novel bird's eye view (BEV) occupancy predictor which is capable of achieving the fusion of asynchronous multi-sensor data and predicting the future occupancy map with variable time intervals and temporal horizons. Remarkably, FusionMotion features the adoption of neural ordinary differential equations on recurrent neural networks for occupancy prediction. FusionMotion learns derivatives of BEV features over temporal horizons, updates the implicit sensor's BEV feature measurements and propagates future states for each ODE step. Extensive experiments on large-scale nuScenes and Lyft L5 datasets demonstrate that FusionMotion significantly outperforms previous methods. In addition, it outperforms the BEVFusion-style fusion strategy on the Lyft L5 dataset while reducing synchronization requirements. Codes and models will be made available.