Abstract:Making the most of next-generation galaxy clustering surveys requires overcoming challenges in complex, non-linear modelling to access the significant amount of information at smaller cosmological scales. Field-level inference has provided a unique opportunity beyond summary statistics to use all of the information of the galaxy distribution. However, addressing current challenges often necessitates numerical modelling that incorporates non-differentiable components, hindering the use of efficient gradient-based inference methods. In this paper, we introduce Learning the Universe by Learning to Optimize (LULO), a gradient-free framework for reconstructing the 3D cosmic initial conditions. Our approach advances deep learning to train an optimization algorithm capable of fitting state-of-the-art non-differentiable simulators to data at the field level. Importantly, the neural optimizer solely acts as a search engine in an iterative scheme, always maintaining full physics simulations in the loop, ensuring scalability and reliability. We demonstrate the method by accurately reconstructing initial conditions from $M_{200\mathrm{c}}$ halos identified in a dark matter-only $N$-body simulation with a spherical overdensity algorithm. The derived dark matter and halo overdensity fields exhibit $\geq80\%$ cross-correlation with the ground truth into the non-linear regime $k \sim 1h$ Mpc$^{-1}$. Additional cosmological tests reveal accurate recovery of the power spectra, bispectra, halo mass function, and velocities. With this work, we demonstrate a promising path forward to non-linear field-level inference surpassing the requirement of a differentiable physics model.
Abstract:$N$-body simulations are computationally expensive, so machine-learning (ML)-based emulation techniques have emerged as a way to increase their speed. Although fast, surrogate models have limited trustworthiness due to potentially substantial emulation errors that current approaches cannot correct for. To alleviate this problem, we introduce COmoving Computer Acceleration (COCA), a hybrid framework interfacing ML with an $N$-body simulator. The correct physical equations of motion are solved in an emulated frame of reference, so that any emulation error is corrected by design. This approach corresponds to solving for the perturbation of particle trajectories around the machine-learnt solution, which is computationally cheaper than obtaining the full solution, yet is guaranteed to converge to the truth as one increases the number of force evaluations. Although applicable to any ML algorithm and $N$-body simulator, this approach is assessed in the particular case of particle-mesh cosmological simulations in a frame of reference predicted by a convolutional neural network, where the time dependence is encoded as an additional input parameter to the network. COCA efficiently reduces emulation errors in particle trajectories, requiring far fewer force evaluations than running the corresponding simulation without ML. We obtain accurate final density and velocity fields for a reduced computational budget. We demonstrate that this method shows robustness when applied to examples outside the range of the training data. When compared to the direct emulation of the Lagrangian displacement field using the same training resources, COCA's ability to correct emulation errors results in more accurate predictions. COCA makes $N$-body simulations cheaper by skipping unnecessary force evaluations, while still solving the correct equations of motion and correcting for emulation errors made by ML.