Motivated by neural network training in low-bit floating and fixed-point environments, this work studies the convergence of variants of SGD with computational error. Considering a general stochastic Lipschitz continuous loss function, a novel convergence result to a Clarke stationary point is presented assuming that only an approximation of its stochastic gradient can be computed as well as error in computing the SGD step itself. Different variants of SGD are then tested empirically in a variety of low-precision arithmetic environments, with improved test set accuracy achieved compared to SGD for two image recognition tasks.