Accurate gravity field models are essential for safe proximity operations around small bodies. State-of-the-art techniques use spherical harmonics or high-fidelity polyhedron shape models. Unfortunately, these techniques can become inaccurate near the surface of the small body or incur high computational costs, especially for binary or heterogeneous small bodies. New learning-based techniques do not encode a predefined structure and are more versatile. In exchange for this versatility, they can be less robust outside the training data domain. In deployment, the spacecraft trajectory is the primary source of dynamics data. Therefore, the training data domain should include spacecraft trajectories so that the learned model's safety and robustness can be evaluated accurately. We have developed a novel method for learning-based gravity models that directly uses the spacecraft's past trajectories. We further introduce a method to evaluate the safety and robustness of learning-based techniques by comparing accuracy inside and outside the training domain. We demonstrate this evaluation method for two learning-based frameworks: Gaussian processes and neural networks. In addition to the detailed analysis, we empirically establish the need for robustness verification of learned gravity models when they are used for proximity operations.