Abstract:Boolean circuits of McCulloch-Pitts threshold gates are a classic model of neural computation studied heavily in the late 20th century as a model of general computation. Recent advances in large-scale neural computing hardware has made their practical implementation a near-term possibility. We describe a theoretical approach for multiplying two $N$ by $N$ matrices that integrates threshold gate logic with conventional fast matrix multiplication algorithms, that perform $O(N^\omega)$ arithmetic operations for a positive constant $\omega < 3$. Our approach converts such a fast matrix multiplication algorithm into a constant-depth threshold circuit with approximately $O(N^\omega)$ gates. Prior to our work, it was not known whether the $\Theta(N^3)$-gate barrier for matrix multiplication was surmountable by constant-depth threshold circuits. Dense matrix multiplication is a core operation in convolutional neural network training. Performing this work on a neural architecture instead of off-loading it to a GPU may be an appealing option.