Data living on manifolds appear in many applications. Often this results from a latent low-dimensional system being observed through higher-dimensional measurements. We show that, under certain conditions, it is possible to construct an intrinsic and isometric data representation that respects the underlying latent geometry. Namely, we view the observed data only as a proxy and learn the structure of the latent, unobserved intrinsic manifold, whereas common practice is to learn the manifold of the observed data. For this purpose, we build a new metric and propose a method for its robust estimation, which assumes mild statistical priors and uses artificial neural networks as a mechanism for metric regularization and parametrization. We demonstrate a successful application to unsupervised indoor localization in ad-hoc sensor networks. Specifically, we show that our proposed method facilitates accurate localization of a moving agent from the imaging data it collects. Importantly, our method is applied in the same way to two different imaging modalities, thereby demonstrating its intrinsic and modality-invariant capabilities.