We present a machine learning framework to train and validate neural networks to predict the anisotropic elastic response of the monoclinic organic molecular crystal $\beta$-HMX in the geometrical nonlinear regime. A filtered molecular dynamic (MD) simulations database is used to train the neural networks with a Sobolev norm that uses the stress measure and a reference configuration to deduce the elastic stored energy functional. To improve the accuracy of the elasticity tangent predictions originating from the learned stored energy, a transfer learning technique is used to introduce additional tangential constraints from the data while necessary conditions (e.g. strong ellipticity, crystallographic symmetry) for the correctness of the model are either introduced as additional physical constraints or incorporated in the validation tests. Assessment of the neural networks is based on (1) the accuracy with which they reproduce the bottom-line constitutive responses predicted by MD, (2) detailed examination of their stability and uniqueness, and (3) admissibility of the predicted responses with respect to continuum mechanics theory in the finite-deformation regime. We compare the neural networks' training efficiency under different Sobolev constraints and assess the models' accuracy and robustness against MD benchmarks for $\beta$-HMX.