Abstract:Informed by the basic geometry underlying feed forward neural networks, we initialize the weights of the first layer of a neural network using the linear discriminants which best distinguish individual classes. Networks initialized in this way take fewer training steps to reach the same level of training, and asymptotically have higher accuracy on training data.