Abstract:Making the most of next-generation galaxy clustering surveys requires overcoming challenges in complex, non-linear modelling to access the significant amount of information at smaller cosmological scales. Field-level inference has provided a unique opportunity beyond summary statistics to use all of the information of the galaxy distribution. However, addressing current challenges often necessitates numerical modelling that incorporates non-differentiable components, hindering the use of efficient gradient-based inference methods. In this paper, we introduce Learning the Universe by Learning to Optimize (LULO), a gradient-free framework for reconstructing the 3D cosmic initial conditions. Our approach advances deep learning to train an optimization algorithm capable of fitting state-of-the-art non-differentiable simulators to data at the field level. Importantly, the neural optimizer solely acts as a search engine in an iterative scheme, always maintaining full physics simulations in the loop, ensuring scalability and reliability. We demonstrate the method by accurately reconstructing initial conditions from $M_{200\mathrm{c}}$ halos identified in a dark matter-only $N$-body simulation with a spherical overdensity algorithm. The derived dark matter and halo overdensity fields exhibit $\geq80\%$ cross-correlation with the ground truth into the non-linear regime $k \sim 1h$ Mpc$^{-1}$. Additional cosmological tests reveal accurate recovery of the power spectra, bispectra, halo mass function, and velocities. With this work, we demonstrate a promising path forward to non-linear field-level inference surpassing the requirement of a differentiable physics model.