Mention detection is an important aspect of the annotation task and interpretation process for applications such as coreference resolution. In this work, we propose and compare three neural network-based approaches to mention detection. The first approach is based on the mention detection part of a state-of-the-art coreference resolution system; the second uses ELMo embeddings together with a bidirectional LSTM and a biaffine classifier; the third uses the recently introduced BERT model. Our best model (using a biaffine classifier) achieved gains of up to 1.8 percentage points (p.p.) in mention recall when compared with a strong baseline in a HIGH RECALL setting. The same model achieved improvements of up to 5.3 and 6.5 p.p. when compared with the best-reported mention detection F1 on the CONLL and CRAC data sets, respectively, in a HIGH F1 setting. We further evaluated our models on coreference resolution by using mentions predicted by our best model in state-of-the-art coreference systems. The enhanced model achieved absolute improvements of up to 1.7 and 0.7 p.p. when compared with the best pipeline system and the state-of-the-art end-to-end system, respectively.