We use deep reinforcement learning to explore a class of heterotic $SU(5)$ GUT models constructed from line bundle sums over Complete Intersection Calabi Yau (CICY) manifolds. We perform several experiments where A3C agents are trained to search for such models. These agents significantly outperform random exploration, in the most favourable settings by a factor of 1700 when it comes to finding unique models. Furthermore, we find evidence that the trained agents also outperform random walkers on new manifolds. We conclude that the agents detect hidden structures in the compactification data, which is partly of general nature. The experiments scale well with $h^{(1,1)}$, and may thus provide the key to model building on CICYs with large $h^{(1,1)}$.