In this paper, we study the loss surface of the over-parameterized fully connected deep neural networks. We prove that for any continuous activation functions, the loss function has no bad strict local minimum, both in the regular sense and in the sense of sets. This result holds for any convex and continuous loss function, and the data samples are only required to be distinct in at least one dimension. Furthermore, we show that bad local minima do exist for a class of activation functions.