Quantum machine learning has emerged as a potential practical application of near-term quantum devices. In this letter, we study a two-layer hybrid classical-quantum classifier in which a first layer of quantum stochastic neurons implementing generalized linear models (QGLMs) is followed by a second, classical, combining layer. The input to the first, hidden, layer is obtained via amplitude encoding, so that the fan-in of each quantum neuron grows exponentially with the number of qubits per neuron. To facilitate implementation of the QGLMs, all weights and activations are binary. Whereas existing training strategies for this class of models are limited to exhaustive search and single-neuron, perceptron-like bit-flip updates, this letter introduces a stochastic variational optimization approach that enables the joint training of the quantum and classical layers via stochastic gradient descent. Experiments show the advantages of the approach for a variety of activation functions implemented by QGLM neurons.
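As a rough illustration of the encoding and neuron model described above, the following classical simulation sketches how amplitude encoding maps a feature vector of length up to 2^n onto n qubits, and how a stochastic neuron with binary weights could produce a binary output. The function names (`amplitude_encode`, `firing_probability`, `sample_neuron`) are hypothetical, and the squared-overlap activation is an assumption borrowed from quantum perceptron models; it is only one possible activation and is not necessarily any of the QGLM activations studied in this letter.

```python
import math
import random


def amplitude_encode(x):
    """Zero-pad x to length 2**n_qubits and scale it to unit L2 norm."""
    n_qubits = max(1, (len(x) - 1).bit_length())
    padded = list(x) + [0.0] * (2 ** n_qubits - len(x))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0.0:
        raise ValueError("cannot amplitude-encode an all-zero vector")
    return [v / norm for v in padded], n_qubits


def firing_probability(amplitudes, weights):
    """Assumed activation: squared overlap between the input state and the
    normalized state defined by binary weights in {-1, +1}."""
    if len(weights) != len(amplitudes):
        raise ValueError("weights and amplitudes must have equal length")
    overlap = sum(w * a for w, a in zip(weights, amplitudes))
    overlap /= math.sqrt(len(weights))
    return overlap ** 2


def sample_neuron(amplitudes, weights, rng):
    """Draw a binary activation that equals 1 with the firing probability."""
    return 1 if rng.random() < firing_probability(amplitudes, weights) else 0


# Five features fit in 3 qubits, since 2**3 = 8 >= 5.
amps, n_qubits = amplitude_encode([3.0, 1.0, 0.0, 2.0, 1.0])
weights = [1, -1, 1, 1, -1, 1, 1, 1]  # one binary weight per amplitude
p = firing_probability(amps, weights)
output = sample_neuron(amps, weights, random.Random(0))
```

The sketch makes the fan-in argument concrete: each additional qubit doubles the number of input amplitudes, and hence the number of binary weights, that a single neuron can process.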