Non-recurrent conditions caused by incidents are different from recurrent conditions that follow periodic patterns. Existing traffic speed prediction studies are incident-agnostic and use one single model to learn all possible patterns from these drastically diverse conditions. This study proposes a novel Mixture of Experts (MoE) model to improve traffic speed prediction under two separate conditions, recurrent and non-recurrent (i.e., with and without incidents). The MoE leverages separate recurrent and non-recurrent expert models (Temporal Fusion Transformers) to capture the distinct patterns of each traffic condition. Additionally, we propose a training pipeline for non-recurrent models to remedy the limited data issues. To train our model, multi-source datasets, including traffic speed, incident reports, and weather data, are integrated and processed to be informative features. Evaluations on a real road network demonstrate that the MoE achieves lower errors compared to other benchmark algorithms. The model predictions are interpreted in terms of temporal dependencies and variable importance in each condition separately to shed light on the differences between recurrent and non-recurrent conditions.