In this paper, we use a machine learning approach to predict the stationary distribution of the number of customers in a single-station multi-server system. We consider two systems: the first has $c$ homogeneous servers, namely the $GI/GI/c$ queue, and the second has two heterogeneous servers, namely the $GI/GI_i/2$ queue. We train a neural network for each of these queueing models, using the first four moments of the inter-arrival and service time distributions as input. We demonstrate empirically that including the fifth moment and beyond does not increase accuracy. Compared to existing methods, our approach achieves state-of-the-art accuracy for both the stationary distribution and the mean number of customers in a $GI/GI/c$ queue. Further, ours is the only method that predicts the stationary distribution of the number of customers in a $GI/GI_i/2$ queue. We conduct a thorough performance evaluation to verify the accuracy of our model; in most cases, the error is below 5\%. Finally, we show that inference is very fast: 5000 inferences can be made in parallel within a fraction of a second.