Unmanned aerial vehicles (UAVs) are expected to be an integral part of wireless networks. In this paper, we aim to find collision-free paths for multiple cellular-connected UAVs while satisfying connectivity requirements with ground base stations (GBSs) in the presence of a dynamic jammer. We first formulate the problem as a sequential decision-making problem in a discrete domain, subject to connectivity, collision avoidance, and kinematic constraints. We then propose an offline temporal difference (TD) learning algorithm with online signal-to-interference-plus-noise ratio (SINR) mapping to solve the problem. More specifically, a value network is constructed and trained offline by the TD method to encode the interactions among the UAVs and between the UAVs and the environment, and an online SINR-mapping deep neural network (DNN) is designed and trained by supervised learning to encode the influence of the jammer and the changes it causes. Numerical results show that, without any information on the jammer, the proposed algorithm can achieve performance close to that of the ideal scenario with a perfect SINR map. Real-time navigation for multiple UAVs can be performed efficiently with high success rates while avoiding collisions.
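To make the TD idea concrete, the following is a minimal illustrative sketch, not the proposed algorithm. It swaps the value network for a lookup table and evaluates a uniformly random policy for a single agent on a small grid, with no jammer, no SINR map, and no other UAVs. The function name `td0_grid_values`, the grid size, the reward values, and the learning parameters are all assumptions chosen for illustration.

```python
import random


def td0_grid_values(size=4, goal=(3, 3), episodes=2000,
                    alpha=0.1, gamma=0.95, seed=0):
    """Tabular TD(0) evaluation of a random policy on a size x size grid.

    Hypothetical toy setting: one agent, four discrete moves, a small step
    penalty, and a reward of 1 on reaching the goal cell.
    """
    rng = random.Random(seed)
    values = {(x, y): 0.0 for x in range(size) for y in range(size)}
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap the episode length
            if state == goal:
                break
            dx, dy = rng.choice(moves)
            # Moves that would leave the grid are clipped to the boundary.
            nxt = (min(max(state[0] + dx, 0), size - 1),
                   min(max(state[1] + dy, 0), size - 1))
            reward = 1.0 if nxt == goal else -0.01
            # The goal is terminal, so it contributes no bootstrapped value.
            bootstrap = 0.0 if nxt == goal else gamma * values[nxt]
            # TD(0) update: move V(s) toward r + gamma * V(s').
            values[state] += alpha * (reward + bootstrap - values[state])
            state = nxt
    return values


if __name__ == "__main__":
    v = td0_grid_values()
    print("value next to goal:", round(v[(3, 2)], 3))
    print("value at start:", round(v[(0, 0)], 3))
```

In this toy setting, cells near the goal end up with higher estimated values than the start cell. A value network plays the analogous role over a much larger joint state that a table could not enumerate.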