A deep learning model is presented to nowcast the occurrence of lightning at a five-minute time resolution 60 minutes into the future. The model is based on a recurrent-convolutional architecture that allows it to recognize and predict the spatiotemporal development of convection, including the motion, growth and decay of thunderstorm cells. The predictions are performed on a stationary grid, without the use of storm object detection and tracking. The input data, collected from an area in and surrounding Switzerland, comprise ground-based radar data, visible/infrared satellite data and derived cloud products, lightning detection, numerical weather prediction and digital elevation model data. We analyze different alternative loss functions, class weighting strategies and model features, providing guidelines for future studies to select loss functions optimally and to properly calibrate the probabilistic predictions of their model. Based on these analyses, we use focal loss in this study, but conclude that it only provides a small benefit over cross entropy, which is a viable option if recalibration of the model is not practical.