In this paper we study an emerging class of neural networks based on the morphological operators of dilation and erosion. We analyze these networks mathematically from the perspectives of both tropical geometry and mathematical morphology. Our contributions are threefold. First, we examine the training of morphological networks via Difference-of-Convex programming methods and extend a binary morphological classifier to multiclass tasks. Second, we focus on the sparsity of dense morphological networks trained via gradient descent and compare their performance to that of their linear counterparts under heavy pruning, showing that the morphological networks cope far better and exhibit superior compression capabilities. Our analysis accounts for the effect of the training optimizer and offers both quantitative and qualitative explanations. Finally, we study how the architecture of a morphological network can enforce shape constraints, focusing on monotonicity. Via Maslov dequantization, we obtain a softened version of a known architecture and show how this approach can improve training convergence and performance.
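For concreteness, the dilation and erosion operators underlying these networks take the following max-plus and min-plus forms; the notation below ($\delta$, $\varepsilon$, weight vector $\mathbf{w}$) is an illustrative choice rather than the paper's own:
\[
\delta_{\mathbf{w}}(\mathbf{x}) = \max_{1 \le i \le n} \left( x_i + w_i \right),
\qquad
\varepsilon_{\mathbf{w}}(\mathbf{x}) = \min_{1 \le i \le n} \left( x_i - w_i \right).
\]
A morphological neuron thus replaces the weighted sum of a linear unit with a maximum (or minimum) of additively shifted inputs, which is precisely a tropical polynomial in the max-plus semiring.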
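The softening referred to in the final contribution can be read through the standard Maslov dequantization identity, sketched here (the temperature parameter $h$ is our notation, not the paper's):
\[
\max(x_1, \ldots, x_n) = \lim_{h \to 0^+} \, h \log \sum_{i=1}^{n} e^{x_i / h},
\]
so fixing a small $h > 0$ replaces the hard maximum of a dilation layer with a smooth log-sum-exp surrogate, which propagates nonzero gradients to all inputs rather than only the maximizing one.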