Space exploration missions increasingly rely on sophisticated robotic systems with growing autonomy. Deep learning promises to extend this autonomy further, with applications to high-level tasks such as path planning and low-level tasks such as motion control, both of which are critical to mission efficiency and success. Using end-to-end deep reinforcement learning with randomized reward function parameters during training, we teach a simulated 8-degree-of-freedom quadruped ant-like robot to travel anywhere within a perimeter, performing path planning and motion control with a single neural network, without any system model or prior knowledge of the terrain or environment. Our approach also allows for user-specified waypoints, which could translate well to either fully autonomous or semi-autonomous/teleoperated space applications subject to time delays. We train the agent using randomly generated waypoints that are linked to the reward function, and we pass the waypoint coordinates as inputs to the neural network. The approach shows promise for a variety of space exploration robots, including high-speed rovers for fast locomotion and legged cave robots for rough terrain.
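
As a rough illustration of the waypoint mechanism described above, the following minimal sketch shows one way a randomly sampled waypoint could both parameterize the reward and be appended to the network input. It is a sketch under stated assumptions, not the implementation used in this work: the square perimeter, the distance-reduction reward, and the names `sample_waypoint`, `augment_observation`, and `progress_reward` are all hypothetical.

```python
import math
import random


def sample_waypoint(half_width, rng):
    """Draw a waypoint uniformly inside a square perimeter centered at the origin.

    The square shape is an assumption made for illustration.
    """
    return (rng.uniform(-half_width, half_width),
            rng.uniform(-half_width, half_width))


def augment_observation(robot_obs, waypoint):
    """Append the waypoint coordinates to the robot's own observation vector,
    so that a single policy network receives the goal as an input."""
    return list(robot_obs) + [waypoint[0], waypoint[1]]


def progress_reward(prev_xy, curr_xy, waypoint):
    """Reward the reduction in distance to the waypoint over one step.

    This shaping term is an assumed example of a reward whose parameters
    (the waypoint) are randomized during training.
    """
    return math.dist(prev_xy, waypoint) - math.dist(curr_xy, waypoint)


if __name__ == "__main__":
    rng = random.Random(0)
    waypoint = sample_waypoint(5.0, rng)          # new random goal per episode
    obs = augment_observation([0.0] * 4, waypoint)  # placeholder robot state
    reward = progress_reward((0.0, 0.0), (0.1, 0.0), waypoint)
    print(len(obs), waypoint, reward)
```

At deployment, the same input slot could in principle be filled with a user-specified waypoint instead of a sampled one, which is what would make such a policy usable in a semi-autonomous or teleoperated setting.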