Phase association groups seismic wave arrivals according to their originating earthquakes. It is a fundamental task in a seismic data processing pipeline, but challenging to perform for smaller, high-rate seismic events which carry fundamental information about earthquake dynamics, especially with a commonly assumed inaccurate wave speed model. As a consequence, most association methods focus on larger events that occur at a lower rate and are thus easier to associate, even though microseismicity provides a valuable description of the elastic medium properties in the subsurface. In this paper, we show that association is possible at rates much higher than previously reported even when the wave speed is unknown. We propose Harpa, a high-rate seismic phase association method which leverages deep neural fields to build generative models of wave speeds and associated travel times, and first solves a joint spatio--temporal source localization and wave speed recovery problem, followed by association. We obviate the need for associated phases by interpreting arrival time data as probability measures and using an optimal transport loss to enforce data fidelity. The joint recovery problem is known to admit a unique solution under certain conditions but due to the non-convexity of the corresponding loss a simple gradient scheme converges to poor local minima. We show that this is effectively mitigated by stochastic gradient Langevin dynamics (SGLD). Numerical experiments show that \harpa~efficiently associates high-rate seismicity clouds over complex, unknown wave speeds and graciously handles noisy and missing picks.