Metabolic energy consumption of a powered lower-limb exoskeleton user mainly comes from the upper body effort since the lower body is considered to be passive. However, the upper body effort of the users is largely ignored in the literature when designing motion controllers. In this work, we use deep reinforcement learning to develop a locomotion controller that minimizes ground reaction forces (GRF) on crutches. The rationale for minimizing GRF is to reduce the upper body effort of the user. Accordingly, we design a model and a learning framework for a human-exoskeleton system with crutches. We formulate a reward function to encourage the forward displacement of a human-exoskeleton system while satisfying the predetermined constraints of a physical robot. We evaluate our new framework using Proximal Policy Optimization, a state-of-the-art deep reinforcement learning (RL) method, on the MuJoCo physics simulator with different hyperparameters and network architectures over multiple trials. We empirically show that our learning model can generate joint torques based on the joint angle, velocities, and the GRF on the feet and crutch tips. The resulting exoskeleton model can directly generate joint torques from states in line with the RL framework. Finally, we empirically show that policy trained using our method can generate a gait with a 35% reduction in GRF with respect to the baseline.