An extension of the Hawkes process, the Marked Hawkes process distinguishes itself by featuring variable jump size across each event, in contrast to the constant jump size observed in a Hawkes process without marks. While extensive literature has been dedicated to the non-parametric estimation of both the linear and non-linear Hawkes process, there remains a significant gap in the literature regarding the marked Hawkes process. In response to this, we propose a methodology for estimating the conditional intensity of the marked Hawkes process. We introduce two distinct models: \textit{Shallow Neural Hawkes with marks}- for Hawkes processes with excitatory kernels and \textit{Neural Network for Non-Linear Hawkes with Marks}- for non-linear Hawkes processes. Both these approaches take the past arrival times and their corresponding marks as the input to obtain the arrival intensity. This approach is entirely non-parametric, preserving the interpretability associated with the marked Hawkes process. To validate the efficacy of our method, we subject the method to synthetic datasets with known ground truth. Additionally, we apply our method to model cryptocurrency order book data, demonstrating its applicability to real-world scenarios.