This paper considers the problem of sequential fusion of predictions from neural networks (NN) and fusion of predictions from multiple NN. This fusion strategy increases the robustness, i.e., reduces the impact of one incorrect classification and detection of outliers the \nn has not seen during training. This paper uses Laplacian approximation of Bayesian NNs (BNNs) to quantify the uncertainty necessary for fusion. Here, an extension is proposed such that the prediction of the NN can be represented by multimodal distributions. Regarding calibration of the estimated uncertainty in the prediction, the performance is significantly improved by having the flexibility to represent a multimodal distribution. Two class classical image classification tasks, i.e., MNIST and CFAR10, and image sequences from camera traps of carnivores in Swedish forests have been used to demonstrate the fusion strategies and proposed extension to the Laplacian approximation.