Double-difference earthquake relocation is an essential component of many earthquake catalog development workflows. The technique produces high-resolution relative locations of events by minimizing the residuals between observed and predicted differential arrival times of waves from nearby sources, which sharpens the imaging of faults and improves the interpretation of seismic activity. The inverse problem is typically solved iteratively using conjugate-gradient minimization; however, the computational cost increases substantially with the total number of sources and stations considered. Here we propose Graph Double Difference (GraphDD), a Graph Neural Network (GNN) based framework for double-difference relocation that is trained to minimize the double-difference residuals of a catalog in order to locate earthquakes. Through batching and sampling, the method can scale to arbitrarily large catalogs. Our architecture uses one graph to represent the stations and a second graph to represent the sources, and it forms the Cartesian product of the two graphs to capture the relationships between stations and sources (e.g., the residuals and travel-time partial derivatives). This key feature yields a natural architecture for minimizing the double-difference residuals. We apply our model to several distinct test cases, including seismicity from northern California, Turkiye, and northern Chile, which differ widely in data quality and in station and source distributions. We obtain high-resolution relocations in these tests, and the model adapts to different loss functions and location objectives, including learning station corrections and mapping into the reference frame of a different catalog. Our results suggest that a GNN approach to double-difference relocation is a promising direction for scaling to very large catalogs and for gaining new insights into the relocation problem.
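To make the product-graph idea concrete, the following Python sketch builds the Cartesian product of a small source graph and a small station graph, then evaluates a double-difference residual for one source pair at a common station. It is only an illustration under stated assumptions, not the GraphDD implementation: the function names (`cartesian_product_graph`, `travel_time`, `double_difference`), the toy coordinates, and the homogeneous 6 km/s velocity model are all hypothetical choices made for this example.

```python
import itertools
import math


def cartesian_product_graph(source_edges, station_edges, n_sources, n_stations):
    """Cartesian product of an undirected source graph and station graph.

    Nodes are (source, station) pairs. Two nodes are linked when they share
    a station and their sources are linked, or when they share a source and
    their stations are linked.
    """
    nodes = list(itertools.product(range(n_sources), range(n_stations)))
    edges = []
    for i, j in source_edges:          # source-pair edges, one per station
        for k in range(n_stations):
            edges.append(((i, k), (j, k)))
    for k, l in station_edges:         # station-pair edges, one per source
        for i in range(n_sources):
            edges.append(((i, k), (i, l)))
    return nodes, edges


def travel_time(source, station, velocity=6.0):
    """Straight-ray travel time (s) in a homogeneous medium (assumption)."""
    return math.dist(source, station) / velocity


def double_difference(obs, sources, stations, i, j, k, velocity=6.0):
    """Observed minus predicted differential time for sources i, j at station k."""
    observed = obs[(i, k)] - obs[(j, k)]
    predicted = (travel_time(sources[i], stations[k], velocity)
                 - travel_time(sources[j], stations[k], velocity))
    return observed - predicted


# Toy geometry in km (hypothetical values): x, y, depth.
true_sources = [(0.0, 0.0, 8.0), (1.0, 0.0, 8.0), (2.0, 1.0, 9.0)]
stations = [(30.0, 0.0, 0.0), (0.0, 30.0, 0.0), (-30.0, 0.0, 0.0), (0.0, -30.0, 0.0)]
source_edges = [(0, 1), (1, 2)]            # nearby-source pairs
station_edges = [(0, 1), (1, 2), (2, 3)]   # neighboring stations

nodes, edges = cartesian_product_graph(
    source_edges, station_edges, len(true_sources), len(stations))

# Synthetic "observed" arrival times generated from the true locations.
obs = {(i, k): travel_time(true_sources[i], stations[k]) for i, k in nodes}

# Residual at the true locations, and after shifting source 0 by 0.5 km.
res_true = double_difference(obs, true_sources, stations, 0, 1, 0)
trial_sources = [(0.5, 0.0, 8.0)] + true_sources[1:]
res_trial = double_difference(obs, trial_sources, stations, 0, 1, 0)
```

In this sketch each source-pair edge of the product graph, `((i, k), (j, k))`, corresponds to one double-difference measurement, so the residual vanishes at the true locations and becomes nonzero once a source is moved. A relocation scheme would adjust the trial locations to drive these edge residuals toward zero.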