Clustering using deep neural network models has been extensively studied in recent years. Among the most popular frameworks are the VAE and GAN frameworks, which learn latent feature representations of data through encoder/decoder neural network structures. This is a suitable base for clustering tasks, as the latent space often seems to capture the inherent essence of the data effectively, simplifying its manifold and reducing noise. In this article, the VAE framework is used to investigate how gradient ascent on the probability function over data points can be used to process data in order to achieve better clustering. Improvements in classification are observed compared with unprocessed data, although state-of-the-art results are not obtained. Gradient processing does, however, result in more distinct cluster separation, making it simpler to investigate suitable hyperparameter settings such as the number of clusters. We propose a simple yet effective method for investigating a suitable number of clusters for data, based on the DBSCAN clustering algorithm, and demonstrate that determining the number of clusters is facilitated by gradient processing. As an additional curiosity, we find that our baseline model used for comparison, a GMM on a t-SNE latent space for a VAE structure with weight one on reconstruction during training (autoencoder), yields state-of-the-art results on the MNIST data, to our knowledge not beaten by any other existing model.